SUMMARY 


The purpose of the research described in this report was to 
investigate methods for formally specifying and verifying the 
correctness of mathematical software (software which uses 
floating point numbers and arithmetic). The research carried out 
consisted primarily of the following activities: 

1. Reviewing previous attempts at modelling floating point 
arithmetic and formally specifying/verifying mathematical 
software. 

2. Formulating a new model of floating point arithmetic, 
called the "asymptotic paradigm", a language in which 
properties defined in the model such as "asymptotically 
close" can be expressed, and a formal logical system to 
reason about this model. Our present choice of language 
and logic primitives is tentative. Further experimental 
verifications need to be explored. 

3. Investigating how the classical "Verification Condition 
Generation Approach" to program verification could be 
adapted to use the model. 

4. Performing a preliminary investigation of how the more 
innovative "Programming Logic Approach" to program 
verification could be adapted. 

5. Applying the model to verifying several programs under both 
approaches; the programs chosen were simplified versions of 
actual mathematical software. 

Our new model of floating point computation is both intuitively 
clear and useful in verifying the programs we have looked at. 

Actual errors in floating point programs have been discovered. 
Interestingly, a logical error in an IMSL library routine 
uncovered by our techniques appears to be corrected by the 
FORTRAN compiler; running an interpreted BASIC transcription of 
the program does give bad test results. The building of 
verifying compilers which correct the logic of programs has 
always been a goal of program verification, but in the present 
case the compiler's correction (the guard of a loop is changed 
from an incorrect to a correct form) probably arose from 
optimization considerations. The relationship between our model 
of correctness and optimization remains to be investigated. Our 
model also has proved useful in uncovering new algorithms. 
Progress has been made towards integrating the new model into 
automatic verification systems. 


The research described in this report has direct relevance to 
aerospace applications in which correctness of software over the 
floating point reals is critical. 
Chapter 1 


Overview 


The aim of formal verification is to mathematically prove 
programs are correct. Logical techniques for carrying out such 
proofs are first informally developed and then embodied in 
automatic verification systems. The latter provide machine 
support for the often tedious proofs. They keep track of the 
status of the verification (what has been and what remains to be 
proved) and aid in the deductions through the use of automatic 
formula simplification and logical decision procedures for parts 
of mathematics. For those parts of proofs which cannot be 
machine supplied, the verification system acts as a stern, 
humorless proof checker, thus guaranteeing that no step has been 
omitted from the human supplied proof through negligence. 

A prerequisite to proving programs correct is agreement as to 
how correctness should be expressed. Although we discuss an 
alternative approach ("The Programming Logic Approach") later, we 
will focus on the classical Pre and Post Condition form of 
specification. This takes the form of Hoare sentences of the 
form 

{p} S {q} 


where S is a section of code and p, q are formulas in a 
mathematical language (as contrasted with a programming 
language). An example of such a language is first-order logic 
which uses the quantifiers "all" and "some" in addition to the 
Boolean connectives. The formulas p and q contain the variables 
which occur in S. The meaning of the Hoare sentence is 

If the initial values of the variables satisfy p and 
S terminates then the final values of the variables 
satisfy q. 


Since q may also need to refer to the initial values of the 
variables they are allowed to occur in q as 'x, 'y, etc. For 
example, if the specification of S is that it places the 
exponential of x and y in z and that it not change x or y, where 
x, y, and z are integer variables, then the correctness condition 
takes the form 

{y >= 0} S {z = 'x ^ 'y & x = 'x & y = 'y}. 

Here we use ^ for exponentiation and have added the Pre Condition 
that y be non-negative since we have decided that the program 
need only be correct on those values. Alternatively we could 
change the specification by replacing p by "true" (this means we 
are assuming nothing about the initial values) and replacing "z = 
'x ^ 'y" in the Post Condition by 

'y >= 0 -> z = 'x ^ 'y. 
The two specifications express the same correctness condition but 
the former is to be preferred since its Pre Condition would be 
available during the course of the proof of correctness. 
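
As an illustration only, the Hoare sentence for the exponentiation example can be checked at run time on individual inputs. The sketch below is in Python rather than the report's programming language; the function name `power`, the loop, and the invariant noted in the comment are assumptions of the sketch, not part of the report. Run-time assertions test single executions and are not a proof of correctness.

```python
def power(x: int, y: int) -> int:
    """Place x ^ y in z by repeated multiplication (hypothetical program S)."""
    # Pre Condition: y >= 0
    assert y >= 0, "Pre Condition violated"
    x0, y0 = x, y            # initial values, written 'x and 'y in the text
    z, i = 1, 0
    while i < y:
        # Assumed loop invariant for this sketch: z = 'x ^ i
        z = z * x
        i = i + 1
    # Post Condition: z = 'x ^ 'y & x = 'x & y = 'y
    assert z == x0 ** y0 and x == x0 and y == y0
    return z
```

Because the Pre Condition is stated separately, it can be used as a hypothesis at every step of a proof about the loop, which is the advantage the text attributes to the first form of the specification.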

In order for the above example to be meaningful one assumes 
that the programming language does not contain an exponential 
operator, so the program S computes ^ using a loop. This 
illustrates the need for the mathematical language to have more 
operators than the programming language. Quantifiers are also 
useful, as in the following Post Condition which says that the 
final value of y is the first prime after the input 'x: 

Prime(y) & 'x < y & ~ some z ('x < z & Prime(z) & z < y) 

where ~ is the negation symbol and Prime(y) itself needs 
quantifiers: 

y > 1 & all u, v (y = u * v -> u = y or v = y). 

The formal verification of mathematical software (software 
which uses floating point arithmetic) poses special problems not 
encountered in classical program verifications such as those 
mentioned above in which only discrete data types such as 
integers are considered. These problems arise from the 
differences between the physical representations used in machines 
and the ideal, mathematical entities they are based on. 

When verifying integer arithmetic programs one pretends that 
the machine integers are exactly the same as the ideal integers. 
This pretense is, strictly speaking, not valid, since there are 
only finitely many machine integers while there are infinitely 
many ideal integers. The pretense is acceptable, however, for 
two basic reasons: 

1. The machine integer operations are the same as the 
corresponding ideal operations as long as neither the 
arguments nor the result are too large. Thus, as long as 
overflow does not occur, the behavior of a program which 
uses only integer arithmetic is the same as if it were 
using ideal integers. 

2. The verification of programs which use only integer 
arithmetic is thought of as applying only when the program 
runs to termination without an overflow occurring. This 
often includes most of the uses of the program that the 
programmer is interested in. Thus by considering overflow 
as a form of non-termination one can identify the logic of 
the program with the logic of mathematics. 

Because of the first point we are free to use the same symbol 
in the programming and mathematical languages for the arithmetical 
operations so that 

{true} z := x * y {z = x * y} 

is a valid Hoare sentence. Even if the program overflows it is 
still a valid Hoare sentence because the latter only specifies 
what happens if termination occurs. On the other hand, even 
without overflow the above Hoare sentence is not true if x, y, 
and z are floating point variables. While it is true that 
treating integer overflow as non-termination is a kludge, it is 
interesting to note that Ada with its user supplied exception 
handlers has reopened the question of how to properly write the 
Post Conditions of integer programs so that all possibilities are 
specified. 

Thus "pretending" that floating point arithmetic is the same as 
ideal real arithmetic is not acceptable. Floating point 
operations deviate from ideal real number operations through 
roundoff and underflow as well as overflow. It is true that 
floating point operations are the same as the ideal real 
operations when roundoff, underflow and overflow do not occur, 
but such situations are infrequent. Thus if we verify 
mathematical software by "pretending" that the floating point 
operations are exact, and adopt the convention that the 
verification does not apply to runs in which roundoff, underflow 
or overflow occur, then the verification will not apply to most 
of the runs we are interested in. 

For these reasons, we would prefer to verify mathematical 
software on the basis of a model of floating point operations 
which is closer to what is actually done in machines. Several 
such models have been presented ([1], [2], [3]), along with lists 
of axioms which they satisfy. Unfortunately, these axiom 
systems are either too complex, in which case they are difficult 
to use, or they are simple but too weak to do adequate analysis 
of software. In addition, verification using these axiom systems 
usually requires the verifier to formulate and prove elaborate 
statements about the nature and magnitude of various sources of 
numerical error, thus confusing a logical problem with a numerical 
analysis problem. While numerical analysis is important, we feel 
that correctness is a separate issue, as we will show. Even the 
above mentioned systems with simple axioms often involve proving 
theorems which are quite complicated. 

The first major problem is how to express the specifications. 
As we pointed out above z = x * y is not a proper Post Condition 
for the program fragment z := x*y when x, y, and z are floating 
point variables. This problem becomes aggravated further when we 
attempt to use the Verification Condition (VC) approach to 
program verification on mathematical software. One of the 
difficulties encountered in using the VC approach on integer 
programs is that the VCs generated for even simple programs are 
logically complex. This makes it difficult to prove (or even 
understand) the VCs. What is perhaps even more important, it is 
difficult to determine what is wrong with a program when a VC is 
found to be false. This problem becomes even worse when 
complicated axiom systems like those formulated for floating 
point arithmetic are used. In addition, just as it is difficult 
to formulate appropriate specifications for mathematical 
software, it is also difficult to formulate the appropriate 
embedded assertions and loop invariants for such software 
required by the VC approach. 


1. Mansfield, R., A Complete Axiomatization of Computer 
Arithmetic, to appear in the Journal of Mathematics and 
Computation 

2. Holm, John, Floating Point Arithmetic and Program Correctness 
Proofs, Ph.D. thesis, Department of Computer Science, Cornell 
University, August 1980 

3. Brown, W. S., A Simple but Realistic Model of Floating-Point 
Computations, Computing Science Technical Report No. 83, Bell 
Laboratories, April 1981 

This report addresses the above problems in two ways: 

1. A new paradigm for modelling and axiomatizing floating 
point arithmetic, which we will call the asymptotic 
paradigm, is presented. This paradigm yields a simple, 
intuitive axiom system which is strong enough to do 
non-trivial analysis of mathematical software. 

2. This paradigm is applied in the context of two different 
approaches to program verification. One is the VC approach 
and the other is an alternative approach, called the 
Programming Logic approach, which is designed to avoid some 
of the problems which have arisen from the VC approach. 
Our discussion of the VC approach is more definitive, 
reflecting the maturity of the technique; the discussion of 
the Programming Logic Approach is more tentative. 
Chapter 2 


The Denotational Semantics of the Asymptotic Paradigm 


2.1 Modelling Floating Point Arithmetic: The Cropping Function 

Our starting assumption is that the machine implemented 
floating point operations can be represented as the ideal real 
number operations followed by rounding. The operation of 
rounding is modelled by a cropping function, CR, from the real 
numbers (denoted by R) to R. The range of CR represents the 
machine real numbers, sometimes called the model numbers. This 
was the approach taken in the Mansfield and Holm work cited 
previously and is consistent with the proposed IEEE standard for 
floating point arithmetic [1]. 

We will assume CR satisfies the following axioms, hereinafter 
referred to as "the cropping function axioms": 


- Axiom 1: The range of CR is finite. 

- Axiom 2: CR(CR(x)) = CR(x) 

- Axiom 3: CR(0) = 0 

- Axiom 4: [x <= y <= z & CR(x) = CR(z)] -> CR(x) = CR(y) 


1. A Proposed Standard for Binary Floating Point Arithmetic, 
Draft 10.0 of IEEE Task P754, Dec. 1982 


The first axiom expresses the fact that there are only finitely 
many machine real numbers. The second axiom says that the result 
of a rounding operation (i.e. a machine real number) is 
unaffected by further rounding. Note that the second axiom 
implies that the range of CR and the set of fixed points of CR 
are the same. The third axiom says that 0 is a fixed point of 
CR, i.e. that 0 is a machine real number. The fourth axiom says 
that if x and z round to the same number and y is between x and z 
then y rounds to the same number as x and z. 
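
The cropping function axioms can be tried out on a small finite model. The following Python sketch is illustrative only: the set `MODEL_NUMBERS`, the two rounding functions and the checker are hypothetical names introduced here, and testing on finitely many samples is evidence rather than proof. Axiom 1 holds trivially because the list of model numbers is finite.

```python
import bisect
from fractions import Fraction

# Hypothetical finite set of machine reals: multiples of 1/8 in [-4, 4].
MODEL_NUMBERS = [Fraction(k, 8) for k in range(-32, 33)]

def cr_down(x):
    """Round toward minus infinity; below the smallest model number, return it."""
    i = bisect.bisect_right(MODEL_NUMBERS, x)
    return MODEL_NUMBERS[max(i - 1, 0)]

def cr_nearest(x):
    """Round to a nearest model number; ties go to the lower neighbour."""
    return min(MODEL_NUMBERS, key=lambda m: (abs(m - x), m))

def check_cropping_axioms(cr, samples):
    """Test Axioms 2 to 4 and monotonicity on a finite list of sample reals."""
    samples = sorted(samples)
    image = [cr(x) for x in samples]
    assert all(cr(v) == v for v in image)                 # Axiom 2
    assert cr(Fraction(0)) == 0                           # Axiom 3
    for i in range(len(samples)):                         # Axiom 4
        for k in range(i, len(samples)):
            if image[i] == image[k]:
                assert all(image[j] == image[i] for j in range(i, k + 1))
    assert all(a <= b for a, b in zip(image, image[1:]))  # CR is monotone
    return True

SAMPLES = [Fraction(k, 10) for k in range(-45, 46)]
```

Calling `check_cropping_axioms(cr_down, SAMPLES)` or `check_cropping_axioms(cr_nearest, SAMPLES)` exercises the axioms for two different rounding modes over the same set of model numbers.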

One axiom which was included by Mansfield and Holm which we 
do not include is that CR is an odd function, that is, that 
CR(-x) = -CR(x). We do not want to require that CR be odd, since 
this would rule out rounding towards plus infinity and rounding 
towards minus infinity, two rounding modes which the proposed 
IEEE Standard would require to be supported. 

Note that cropping function axioms 2 through 4 are expressed in 
first order logic, whereas the first is expressed in English. 
This is because the concept of "finite" cannot be expressed in 
first order logic without adding concepts from set theory. In 
order to perform truly formal program verification, we must 
eventually express the first axiom more precisely. This will be 
dealt with later. As usual when stating axioms in first order 
logic there are implicit universal quantifiers in front of the 
formulas displayed as Axioms 2 through 4. 

The cropping function axioms are consistent with the four 
rounding modes which the proposed IEEE Standard would require to 
be supported, namely rounding to the nearest machine real number, 
rounding towards 0, rounding towards plus infinity and rounding 
towards minus infinity. They are also consistent with rounding 
away from zero, a mode which is not mentioned in the proposed 
IEEE Standard. 

At this point we can derive some useful consequences of the 
above axioms: 

- Theorem 1: CR is monotone, i.e. x <= y -> CR(x) <= CR(y) 

- Theorem 2: There is no machine real between x and CR(x). 

The proofs of these statements are in Appendix A. They do not use 
Axiom 1, and the only fact about the reals which is needed is 
that <= is a linear order. 

Note that the second statement does not imply that there is no 
machine real that is closer to x than CR(x). Again, we do not 
wish to require this because the proposed IEEE Standard would 
require other rounding modes than rounding to the nearest machine 
real. 
The above cropping function axioms capture certain qualitative 
properties of CR. Other, quantitative properties are captured by 
the error axioms, which are given below. These axioms are 
expressed in terms of five additional constants, M-, m-, m+, M+ 
and e. M+ and M- are the positive and negative overflow 
thresholds respectively; m+ and m- are the positive and negative 
underflow thresholds respectively. e is the relative error 
bound. 

- Axiom 5: M- <= m- < 0 < m+ <= M+ 

- Axiom 6: [x < 0 & CR(x) = x] -> M- <= x <= m- 

- Axiom 7: [x > 0 & CR(x) = x] -> m+ <= x <= M+ 

- Axiom 8: 0 <= e < 1 

- Axiom 9: [M- <= x <= m- or m+ <= x <= M+] -> |CR(x) - x| <= 
e * |x| 

- Axiom 10: x is an integer & [M- <= x <= m- or m+ <= x <= M+] 
-> CR(x) = x 

The first error axiom just states the signs and the order of 
the thresholds. The second and third error axioms say that the 
negative machine reals are bounded by M- and m- and the positive 
machine reals by m+ and M+. The fourth error axiom gives bounds 
on e, and the fifth says that e is a bound on the relative error 
in the cropping function for numbers between the thresholds. 
The reason for having separate overflow and underflow thresholds 
for positive numbers and negative numbers is that it makes it 
easier to model rounding towards plus infinity and rounding 
towards minus infinity. These two rounding modes are not 
symmetric with respect to zero, and so we need to be able to 
treat the behavior on either side of 0 separately. The last 
axiom guarantees that integers in range round to themselves. We 
discovered the need for this axiom only after we began proving 
programs which had integer literals in the text. 
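
To make the error axioms concrete, here is a Python sketch of a toy decimal format with three significant digits. Everything in it (the constants `P`, `KMIN`, `KMAX`, the function `cr`, and the choice to round underflowing values to 0 or to the threshold) is an assumption made for illustration and is not taken from the report. Axioms 5 to 9 are tested on samples. Axiom 10 is tested only for integers of at most `P` digits, because this finite toy format cannot represent longer integers exactly.

```python
import math
from fractions import Fraction

P = 3                                   # significant decimal digits (toy format)
KMIN, KMAX = -2, 2                      # exponent range of the toy format
M_PLUS = (10 ** P - 1) * Fraction(10) ** KMAX    # positive overflow threshold
m_PLUS = 10 ** (P - 1) * Fraction(10) ** KMIN    # positive underflow threshold
M_MINUS, m_MINUS = -M_PLUS, -m_PLUS
E = Fraction(1, 2) * Fraction(10) ** (1 - P)     # relative error bound e

def cr(x):
    """Toy cropping function: round to nearest P-digit decimal, saturating."""
    x = Fraction(x)
    if x == 0:
        return x
    sign = 1 if x > 0 else -1
    a = abs(x)
    if a >= M_PLUS:                     # overflow: saturate at the threshold
        return sign * M_PLUS
    if a < m_PLUS:                      # underflow: go to 0 or to the threshold
        return sign * m_PLUS if 2 * a >= m_PLUS else Fraction(0)
    k = KMIN
    while a >= Fraction(10) ** (k + P): # find k with a / 10^k < 10^P
        k += 1
    scale = Fraction(10) ** k
    digits = math.floor(a / scale + Fraction(1, 2))
    return sign * digits * scale

def check_error_axioms(samples):
    assert M_MINUS <= m_MINUS < 0 < m_PLUS <= M_PLUS      # Axiom 5
    assert 0 <= E < 1                                     # Axiom 8
    for x in samples:
        y = cr(x)
        assert cr(y) == y                                 # y is a machine real
        if y < 0:
            assert M_MINUS <= y <= m_MINUS                # Axiom 6
        if y > 0:
            assert m_PLUS <= y <= M_PLUS                  # Axiom 7
        if M_MINUS <= x <= m_MINUS or m_PLUS <= x <= M_PLUS:
            assert abs(y - x) <= E * abs(x)               # Axiom 9
    return True

SAMPLES = [s * Fraction(n, 7) for s in (-1, 1) for n in range(0, 800000, 997)]
```

The thresholds here are symmetric about zero only because the toy rounding is symmetric; a directed rounding mode would generally call for different values on the two sides.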

In terms of CR we make the following definitions: 

- Definition 1: MR(x) iff CR(x) = x, i.e. x is a machine 
real. 

- Definition 2: x ++ y = CR(x + y). 

- Definition 3: x ** y = CR(x * y). 

- Definition 4: x -- y = CR(x - y). 

- Definition 5: x // y = CR(x / y). 

- Definition 6: x ^^ y = CR(x ^ y). 


We assume that ++, **, etc. applied to machine reals model the 
machine operations. Previous axiom systems for floating point 
arithmetic were stated in terms of ++, **, etc. Unfortunately, 
these operations satisfy peculiar properties (e.g., ++ is 
commutative but not associative) so that verification in terms of 
them becomes complicated. In particular, the approach through 
the logic of ++, **, etc. forces the verifier to state and prove 
complicated error statements. For example, while the Hoare 
statement 

{true} z := x * y {z = 'x ** 'y} 

is true where x, y, and z are floating point variables, it would 
take a close analysis of errors to show that 

{true} S {z = x ^^ n} 

is true where z and x are floating point variables, i and n are 
integer variables, and S is 

i := 0; 
DO WHILE(i < n); 
    z := z * x; 
    i := i + 1; 
END;. 

The point is: when does 

CR(x * x * ... * x) = [CR(x) ** CR(x)] ... ** CR(x) 

where the products on both sides are n-fold? Our use of * in 
the program text for machine multiplication follows normal 
convention; to be precise we should really use ** (although the 
use of + between the integer variables is not objectionable), 
which from now on we will. 
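
The two sides of this question can be computed side by side. The Python sketch below uses the standard `decimal` module with a hypothetical four-digit machine standing in for CR; the names `MACHINE`, `EXACT`, `machine_power` and `cropped_exact_power` are introduced here for illustration, and z is initialised to 1, which the program fragment above leaves implicit. Whether the two results agree depends on x and n, so no particular outcome is claimed.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

MACHINE = Context(prec=4, rounding=ROUND_HALF_EVEN)  # hypothetical 4-digit machine
EXACT = Context(prec=200)     # wide enough that small products are not rounded

def cr(v: Decimal) -> Decimal:
    """Cropping function of the toy machine: round v to 4 significant digits."""
    return MACHINE.plus(v)

def machine_power(x: Decimal, n: int) -> Decimal:
    """n-fold machine product [x ** x] ... ** x, rounding after every step."""
    z = Decimal(1)
    for _ in range(n):
        z = MACHINE.multiply(z, x)
    return z

def cropped_exact_power(x: Decimal, n: int) -> Decimal:
    """CR(x * x * ... * x): form the exact product, then round once."""
    z = Decimal(1)
    for _ in range(n):
        z = EXACT.multiply(z, x)
    return cr(z)

x = cr(Decimal("1.234"))
print(machine_power(x, 5), cropped_exact_power(x, 5))
```

Proving a bound on the gap between the two printed values is exactly the kind of error analysis that a specification written with ** and ^^ would force on the verifier.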

Actually are we really interested in Post Conditions which 
contain the machine operations ++, **, ^^ etc.? These 
operations are implementation details and should not enter into 
formal specifications for Higher Order Language (HOL) programs. 
But then, what kinds of statements do we want to prove about 
mathematical software, what are the appropriate 
specifications? 


In the next section, we begin discussing our solution to this 
problem. We center our discussion on the asymptotic behavior of 
a program, that is, the behavior as the precision of the floating 
point arithmetic used increases. We will be able to show that 
the above program correctly implements "x^n" because as precision 
increases the output tends to x^n in the limit. Our logic will 
enable us to prove this without having to actually carry out the 
limiting constructions. The latter are in the meta-theory which 
justifies our axioms and need not be understood by the prover 
(human or machine) although such understanding would often 
facilitate the finding of proofs. 


2.2 Appropriate Specifications for Mathematical Software 


2.2.1 A Motivating Example 

Suppose we wanted to write a program whose informal 
specification was "Add up the entries in a one-dimensional array 
A of machine reals with length 3". We might produce something 
like the following: 

I := 1; 
SUM := 0.0; 
DO WHILE(I <= 3); 
    SUM := SUM ++ A(I); 
    I := I + 1; 
END;. 

Note that we have adopted the convention of using ++, **, etc. 
in the program text for the floating point operations. Suppose 
we now wanted to formally verify that the program was correct. 
The first thing we would have to do is translate the above 
informal specification into a formal statement of what should be 
true when the program terminates. Since the addition in the 
third assignment statement is machine addition, we cannot expect 
to have 

SUM = A(1) + A(2) + A(3)                                  (1) 

on termination. We could instead say that we want 

SUM = [A(1) ++ A(2)] ++ A(3)                              (2) 

at termination. Although this statement is true when the program 
in question terminates, it is not the correct formalization of 
the informal specification. To see why, imagine that we had 
written the program so that it added up the entries of A in the 
opposite order. Such a program would meet the informal spec as 
much as the above program does, but it would not necessarily meet 
(2). We could do an error analysis of the program to obtain some 
kind of specification like 
kind of specification like 
|SUM - (A(1)+A(2)+A(3))| <= eps*(A(1)+A(2)+A(3))          (3) 

where eps is some expression which depends on error constants 
associated with the machine's rounding. This has the same 
problem as the previous proposal: a different program might not 
meet the same error analysis yet still meet the informal 
specification. Furthermore, both (2) and (3) are examples of 
basing the spec on the program rather than vice versa. We need a 
specification which is independent of the program. 
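
The dependence of specification (2) on the order of summation is easy to observe. In this Python sketch, ordinary Python floats (IEEE binary64) stand in for the machine reals and `+` plays the role of ++; the array contents and the function names are chosen here purely for illustration.

```python
def sum_forward(a):
    """Add the entries of a from first to last, as the program above does."""
    total = 0.0
    i = 0
    while i < len(a):
        total = total + a[i]
        i = i + 1
    return total

def sum_backward(a):
    """Add the same entries in the opposite order."""
    total = 0.0
    i = len(a) - 1
    while i >= 0:
        total = total + a[i]
        i = i - 1
    return total

A = [0.1, 0.2, 0.3]
print(sum_forward(A), sum_backward(A))
```

On binary64 hardware the two functions return different floats for this array, although both differ from each other by less than 1e-15. Both programs satisfy the informal specification equally well, but only the first satisfies (2) as written.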


2.2.2 The Asymptotic Concept 

Actually, the first of the three answers above is the closest: 
we wanted the program to give us the sum of the entries of A. We 
didn't really expect it to give us the exact sum, however, but 
rather something "close" to the exact sum. What do we mean by 
"close"? We don't really mean "as close as the machine can get", 
i.e. we don't mean that SUM should be the closest machine real 
to the actual sum. Nor should we expect SUM to be CR[A(1) + A(2) 
+ A(3)]. What we really need is some formalization of the concept 
of "close" and a logic to reason about this concept. 

We could take (3) as our definition of "close", but with a 
pre-determined eps rather than one derived from the program. 
There are two problems with this: 

1. The above program, running on a given machine, might not 
meet the condition we set for the eps we chose because the 
machine's arithmetic is not sufficiently precise. This is 
a problem with the machine, not the program, and the 
program really should not be called incorrect. 

2. A program might have errors in it and still meet this kind 
of spec because the magnitude of the error on the 
particular machine being used was much smaller than the eps 
we chose. Such an error might suddenly show up if a 
smaller eps was used. 


In the first case, the program fails to meet the spec but would 
if the machine were more precise. In the second case, the 
program meets the spec but would fail if the spec were more 
demanding. What we really want is that for any "degree of 
precision" in the Post Condition, there is a "sufficiently 
precise machine" such that the result of running the program on 
that machine meets the required degree of precision. 


Another way of saying this is to say that as the precision of 
the machine increases, the precision of the result of running the 
program on that machine increases. It is our point of view that 
whereas numerical analysis of the program shows how the precision 
of the result increases as that of the machine increases, a 
logical analysis of the program can determine that there is such 
an increase and this is what we shall mean by correctness. 

We formulate this concept by considering the asymptotic 
behavior of the program over a series of machines of increasing 
precision. We will require that "correctness" be a limiting 
concept even though we only intend to execute the program on a 
single machine with fixed precision. We make these remarks 
precise by introducing new machines which are the "limits" of 
sequences of machines of increased precision. These limit 
machines don't operate over the ideal reals but over non-standard 
models of the reals. These models contain all the ideal reals 
together with other numbers which are "infinitesimally" close to 
the standard, ideal reals. What do these new numbers correspond 
to? Essentially to a particular sequence of machine 
approximations of increasing precision. A different sequence 
converging to the same ideal number would give rise to a 
different non-standard number. A fixed program P can be run with 
any of these inputs. Consider such a program P and a 
mathematical function f from R to R. If it's the case that 
whenever x and y are infinitesimally close to the standard z we 
get that P(x) and P(y) are infinitesimally close to f(z) then we 
can say that P correctly implements f. The Post Condition will 
have the form {result == f(x)} where == is our symbol for 
"infinitesimally close". 
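
The idea of running one fixed program on a sequence of machines of increasing precision can be imitated numerically. The Python sketch below is only a finite illustration of the limiting behavior, not the non-standard construction itself: machines are modelled by `decimal` contexts with a growing number of digits, and the names `run_power_program` and `relative_error` are hypothetical.

```python
from decimal import Context, Decimal
from fractions import Fraction

def run_power_program(x: Fraction, n: int, digits: int) -> Decimal:
    """Run the repeated-multiplication program on a machine with `digits` digits."""
    machine = Context(prec=digits)
    # The input is first cropped to a machine real of this machine.
    xm = machine.divide(Decimal(x.numerator), Decimal(x.denominator))
    z = Decimal(1)
    for _ in range(n):
        z = machine.multiply(z, xm)     # machine multiplication z ** x
    return z

def relative_error(x: Fraction, n: int, digits: int) -> Fraction:
    """Distance between the program output and the ideal value x ^ n."""
    exact = x ** n
    return abs(Fraction(run_power_program(x, n, digits)) - exact) / exact

for d in (4, 8, 16, 32):
    print(d, float(relative_error(Fraction(1, 3), 5, d)))
```

The printed errors shrink as the number of digits grows, which is the behavior the asymptotic notion of correctness asks for. A numerical analysis would say how fast they shrink; the logical question is only whether they do.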


2.3 Non-Standard Analysis 


This section is a brief exposition of the relevant mathematical 
notions for understanding non-standard analysis. We assume that 
the reader is familiar with the terminology of set theory. We 
begin by reviewing the language and interpretations of 
many-sorted first order logic. 

2.3.1 Many-sorted First Order Logic 

A language L of many-sorted first order logic consists of the 
following: 

- A set of sort or type symbols, Sort. 

- A set of constant symbols, Con. Each constant symbol c in 
Con has a sort. 

- A set of function symbols, Fun. Each function symbol f in 
Fun has a signature <s1, .., sn> of sort symbols. 

- A set of relation symbols, Rel. Each relation symbol R in 
Rel has a signature <s1, .., sn> of sort symbols. 

- A symbol for the identity relation: =. 

- For each sort s there is an infinite list of variables of 
that sort. 

- The symbols for the Boolean connectives: &, or, ->, iff, ~ 
and the Boolean constants "true" and "false". 

- The symbols for the quantifiers: all, some. 

- Punctuation marks: "(", ")", etc. 

Using these data one can define the terms and formulas. Each 
term has a sort. Both syntactic sets are defined recursively. 

A variable or a constant is a term of the appropriate sort. If 
f is a function symbol of signature <s1, .., sn> and t1, .., 
tn-1 are terms, ti of sort si, then f(t1, .., tn-1) is a term of 
sort sn. 

If t1 and t2 are terms of the same sort then (t1 = t2) is an 
atomic formula. If R is a relation symbol of signature <s1, .., 
sn> and t1, .., tn are terms, ti of sort si, then R(t1, .., tn) 
is an atomic formula. The Boolean constants are atomic 
formulas. If F and G are formulas then so are (F & G), (F or G), 
(F -> G), (F iff G), (~F). If F is a formula and x is a variable 
then 

some x F 

all x F 

are formulas. For convenience the above is frequently 
abbreviated and condensed. For example we won't assign sorts to 
variables but write 

some x:s F 

where s is a sort symbol and x is an unsorted variable. 
The scope of a quantifier is the smallest formula containing 
it. An occurrence of a variable is called bound if it is within 
the scope of a quantifier on the same variable. Otherwise the 
occurrence is called free. A formula F without any free 
occurrences of variables is called a sentence. 

An interpretation I of the language consists of a non-empty set 
I(s) for each sort symbol s; an object I(c) in I(s) for each 
constant c of sort s; a function I(f) from I(s1) x .. x I(sn-1) 
to I(sn) for each function symbol f of signature <s1, .., sn>; a 
relation I(R) which is a subset of I(s1) x .. x I(sn) for each 
relation symbol R of signature <s1, .., sn>. 

Given a sentence F and an interpretation I it is either true or 
false that F holds under I. We write I |= F if F is true in I. 
For each interpretation I we define its theory Th(I) to be 

{F: I |= F} 

and for every sentence F we define its model class, Mod(F), to 
be 

{I: I |= F}. 

More generally, if K is a class of interpretations of L then 
Th(K) is the intersection of all the Th(I) such that I is in K, 
and if S is a set of sentences then Mod(S) is the intersection of 
all Mod(F) such that F is in S. Th(K) is the set of all sentences 
true in all structures in K and Mod(S) is the set of all 
structures which satisfy every sentence in S (i.e., they are the 
models of S). In terms of these we can define the logical 
consequence operation, Cn(S), 

Cn(S) = Th(Mod(S)). 

Cn(S) is the set of all sentences true in all models of S. 
Although our definitions are completely semantic and seem to 
require a great deal of set theory this is not the way 
mathematicians actually construct Cn(S). If S is the set of 
axioms for Euclidean geometry one doesn't search through all 
models to determine whether the Pythagorean theorem is an actual 
theorem. Instead one proves the latter from the former set of 
axioms. Fortunately, first-order logic has a complete set of 
proof rules which can be mechanized. If a machine can be 
constructed to automatically enumerate the set S then another can 
be constructed to automatically enumerate Cn(S). Unfortunately, 
this theoretical result is not often as useful as it sounds since 
Cn(S) may be listed in no particularly significant order. To get 
good results one needs interactive theorem provers to guide the 
generation of Cn(S). 

In certain cases, Cn(S) is not only enumerable it is decidable; 
that is there is a program which when supplied with a sentence F 
determines in a finite amount of time whether F is in Cn(S). This 
is true for the theory of real closed fields described below. 

Two interpretations, I1 and I2, are called elementarily 
equivalent if Th(I1) = Th(I2). This means they are 
indistinguishable as far as the expressive power of the language 
L is concerned. If F is a formula then there is no meaning to I |= F (e.g. if 
F is 



all x (all y (x <= y)) 

then it doesn't mean anything to say F is true or false in say 
the standard structure over the reals although in this case the 
first sentence above is true and the second false. On the other 
hand if F has free variables x1, .., xn and a1, .., an are 
objects from the underlying set given by I (we are assuming for 
simplicity a single sort) then 

I |= F[a1, .., an] 

does make sense. For example if I and F are the structure and 
formula mentioned previously then 

I |= F[5,6] 

is true while 

I |= F[6,5] 

is false. 
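
The difference between a sentence and a formula with free variables can be made concrete with a small evaluator. The Python sketch below makes several assumptions that are not in the text: formulas are encoded as nested tuples, the formula F is taken to be x <= y (which is what the values of F[5,6] and F[6,5] suggest), and the interpretation has a finite domain so that quantifiers can be evaluated by enumeration, unlike the standard structure over the reals.

```python
def holds(formula, interp, env):
    """Decide whether interp |= formula under the variable assignment env."""
    op = formula[0]
    if op == "le":                       # atomic formula v1 <= v2
        return interp["le"](env[formula[1]], env[formula[2]])
    if op == "not":
        return not holds(formula[1], interp, env)
    if op == "and":
        return holds(formula[1], interp, env) and holds(formula[2], interp, env)
    if op == "all":                      # ("all", variable, body)
        _, var, body = formula
        return all(holds(body, interp, {**env, var: a}) for a in interp["domain"])
    if op == "some":                     # ("some", variable, body)
        _, var, body = formula
        return any(holds(body, interp, {**env, var: a}) for a in interp["domain"])
    raise ValueError(f"unknown connective: {op!r}")

# Hypothetical finite interpretation: the integers 0..9 with the usual order.
INTERP = {"domain": range(10), "le": lambda a, b: a <= b}
F = ("le", "x", "y")

print(holds(F, INTERP, {"x": 5, "y": 6}))    # I |= F[5,6]
print(holds(F, INTERP, {"x": 6, "y": 5}))    # I |= F[6,5]
```

Calling `holds(F, INTERP, {})` raises a `KeyError`, mirroring the remark that I |= F has no meaning until values are supplied for the free variables, whereas a sentence such as `("all", "x", ("all", "y", F))` can be evaluated with an empty assignment.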

I1 is called a subinterpretation of I2 if I1(s) is a subset of
I2(s) for each sort symbol, I1(c) = I2(c) for each constant
symbol, and I1(f) and I1(R) are the restrictions of I2(f) and I2(R)
for each function and relation symbol. For example, one usually
thinks of the integers with +, *, 0, 1, and <= as a
subinterpretation of the reals with the same operations (this is
the basis of the standard overloading of arithmetic symbols). A
stronger relation between I1 and I2 is that of "elementary
subsystem" where in addition to being a subinterpretation we have

I1 |= F[a1, ..., an] iff I2 |= F[a1, ..., an]

for all formulas F with free variables x1, ..., xn and all a1,
..., an from the sort sets of I1. This relation implies that I1
and I2 are elementarily equivalent but is much stronger.

The basic language which talks about the reals includes the
constant symbols 0 and 1, the function symbols + and *, and
relation symbols like <=. One should distinguish these syntactic
objects from the actual operations given in an interpretation,
although in practice one tends not to, since to do so would
require a complicated meta-language. The standard model for this
language is the usual interpretation. This language is referred
to below as the language of real closed fields. Real closed
fields are special kinds of fields defined in algebra in a purely
algebraic way. Their axioms state that 0, 1, +, *, <= form an
ordered field in which every positive number has a square root and
every odd degree polynomial has a zero. It is a classic result of
Tarski that exactly the same sentences are true in the standard
model as are true in any real closed field, that is, all real
closed fields are elementarily equivalent. Furthermore, the set of
first-order consequences of the theory of real closed fields is
decidable. The theory of real closed fields is an example of a
single-sorted theory. Adding a predicate N(x) to the language which
singles out the integers destroys the decidability of the theory.

2.3.2 Introduction to Non-Standard Analysis 

Calculus was developed in the eighteenth century based on the
notion of infinitesimals. These were positive entities dx smaller
than any actual positive real but not 0. Furthermore, they obeyed
the laws of ordinary real arithmetic so that one could carry out
ordinary algebraic manipulations like

y = x^2
y + dy = (x + dx)^2
(x + dx)^2 = x^2 + 2 * x * dx + (dx)^2
dy = 2 * x * dx + (dx)^2
dy/dx = 2 * x + dx.

In particular the derivative, dy/dx, was the actual quotient of
two infinitesimals. In terms of our previous discussion we would
say that these extended reals formed a real closed field.

Attempts in the nineteenth century to justify working with 
these extended reals were not successful and a different approach 
and proof technique in terms of limits was adopted instead (the 
so-called epsilon/delta method.) 

In the early 60's logicians showed how to justify working with
actual infinitesimals. This accomplishment consisted of two
parts. First, models were constructed of domains containing
infinitesimals. The proof of the existence of these models
requires non-constructive techniques (the axiom of choice) and as
a result, although they are conceivable, the models are not quite
visible. This contrasts with the standard model of the reals
which is always identified with the visible continuum. Owing to
twentieth-century advancements in basic physics, tangibility and
visibility of models is no longer considered a necessity, although
its absence does make a subject less accessible to the uninitiated.

In addition to making models, various axiom systems reflecting
how the infinitesimals in these models behave were constructed.
The models prove that the axioms are consistent but all proofs
using infinitesimals can be carried out completely from the axioms
without any concern for the models. Again this is similar to the
method used in modern physics. Students are taught how to
manipulate the formalisms of quantum mechanics before they learn
(if they ever do) how to construct the underlying Hilbert spaces
which justify the formalism. In the case of calculus, freshman
textbooks have appeared using these axioms [2]. Freshmen do not
know enough mathematics (in particular, modern algebra: groups,
rings, and fields) to follow the actual existence proof of the
models. They just learn how to use the axioms. The axioms are
accepted because the notion of infinitesimal is very intuitive (it
is used in many older, "non-rigorous" engineering texts) and the
students see that the axioms presented do capture some properties
of their intuitive understanding of the infinitesimally small.
Furthermore, they rely on their teacher's word that the axioms
will be justified in advanced courses.

Our proofs of programs can also rely on such axioms without the
need to go through the construction of the models. On the other
hand, to understand why non-standard analysis is relevant to
machine arithmetic one needs to be able to understand these
constructions. After the justification is made and accepted one
can just work formally using the axioms.

A first approach to building a real closed field with
infinitesimals is to consider the set U of all sequences <ai> of
reals. If ai is a for all i then <ai> can be identified with a
and represents a standard real. Perhaps sequences <ai> of
positive reals converging to zero can play the role of
infinitesimals? It is not difficult to interpret the constants 0
and 1 and the function symbols + and * in this set. For example,

<ai> + <bi> = <ai + bi>

<ai> * <bi> = <ai * bi>.

[2] Keisler, J., Foundations of Infinitesimal Calculus,
Prindle-Weber-Schmidt, 1976.

The resulting structure is a ring but not a field. We use the
term "ring" to mean a commutative ring with a 1 distinct from 0.
To see why U is not a field consider the elements defined by

ai = if i is even then 0 else 1/i
bi = if i is even then 1/i else 0

then <ai> ~= 0 and <bi> ~= 0 but <ai> * <bi> = 0, which cannot
happen in any field. In fact <ai> and <bi> are infinitesimals in
our structure, call them dy and dx, and it is thus impossible to
form the quotient dy/dx.
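The zero-divisor phenomenon just described can be checked mechanically on a finite window of indices. The following is only an illustrative sketch: the names a, b, product and the window size N are assumptions introduced here, and indices are taken to start at 1 so that 1/i is always defined.

```python
from fractions import Fraction

def a(i):
    # a_i = 0 for even i, 1/i for odd i
    return Fraction(0) if i % 2 == 0 else Fraction(1, i)

def b(i):
    # b_i = 1/i for even i, 0 for odd i
    return Fraction(1, i) if i % 2 == 0 else Fraction(0)

def product(s, t):
    # pointwise product of two sequences, as in <ai> * <bi> = <ai * bi>
    return lambda i: s(i) * t(i)

N = 1000                      # size of the finite window (an assumption)
indices = range(1, N + 1)     # indices start at 1 to avoid 1/0

a_nonzero = any(a(i) != 0 for i in indices)
b_nonzero = any(b(i) != 0 for i in indices)
ab_zero = all(product(a, b)(i) == 0 for i in indices)

print(a_nonzero, b_nonzero, ab_zero)
```

On every index one of the two factors vanishes, so the product is the zero sequence although neither factor is. A finite window cannot prove this for all i, but here the parity argument makes the general case evident.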

In the above example one would like it if either <ai> or <bi>
were to be considered 0. But if all sequences converging to 0
were to be considered 0 there would be no infinitesimals! What
does it mean that some <ci> in U other than 0 is to be considered
0? One way to make this precise is to find an equivalence
relation E on U in which <ci> and 0 are equivalent and to replace
U by the collection U/E of equivalence classes. Such
constructions are common in algebra. If the equivalence relation
E satisfies the congruence axiom (sometimes called the
"substitution axiom")

x E y & z E w -> (x + z) E (y + w) & (x * z) E (y * w)

then one can define + and * on the equivalence classes and still
have a ring. It is shown in algebra that congruence relations on
a ring are in one-to-one correspondence with the ideals of the
ring: if E is such a congruence then J(E) = {c : c E 0} is the
corresponding ideal and if J is an ideal then E(J) = {(x, y) : x
- y in J} is the corresponding congruence relation.

The question thus becomes: find an ideal of U containing dy =
<ai> but not containing dx = <bi> (or vice versa). Now U is a
collection of real sequences, that is, functions from the set N of
natural numbers to the set R of reals. 0 in U is that function
which is always 0. Suppose we relax this condition somewhat and
let J be the set of <ai> such that ai is eventually 0. J is an
ideal but unfortunately this ideal doesn't solve our problem
since neither of the dx, dy defined previously is in it. On the
other hand J does suggest an approach, namely: what makes J an
ideal? We can state the definition of J in the following way:
Let F be the collection of all cofinite subsets of N (i.e. A is
in F iff N - A is finite). Then J is the set of all <ai> such
that {i : ai = 0} is in F.

More generally, let F be a collection of subsets of N and
define J(F) to be the set of <ai> such that {i : ai = 0} is in F.
What properties must F have to ensure that J(F) is an ideal?
Recall that a non-empty subset J of a ring is an ideal if

I1: a in J -> a * b in J, where b is any ring element

I2: a, b in J -> a + b in J.

Suppose a is in J(F) so that Z(a) = {i : ai = 0} is in F. For
any b in U, Z(a * b) is a superset of Z(a) so that if F has the
property

F1: A in F and A subset B -> B in F

then I1 is true. Now suppose a and b are in J(F) so that Z(a) and
Z(b) are in F. Now Z(a + b) is a superset of (Z(a) intersect
Z(b)) so if the non-empty F also has the property

F2: A and B in F -> (A intersect B) in F

then I2 is true, and so F1 and F2 imply J(F) is an ideal. We must
watch out for one case however. The ideal consisting of the whole
ring will collapse everything to 0. An ideal not equal to the
whole ring is called proper. The improper ideal is the only ideal
containing 1. It is given by an F containing all the subsets of N.
By F1, F contains all the subsets just when it contains the empty
set. Thus we add the condition

F3: F does not contain the empty set.

Non-empty F satisfying F1 and F2 are called filters. If in
addition F3 is satisfied the filter is called proper.
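For a small finite index set the conditions F1, F2 and F3 can be checked by brute force. The sketch below is illustrative only: the helper names powerset, is_filter and is_proper and the three sample collections are assumptions made here, not part of the report.

```python
from itertools import combinations

def powerset(universe):
    # all subsets of the universe, as frozensets
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, universe):
    # non-empty, F1 (closed under supersets), F2 (closed under intersection)
    F = set(F)
    if not F:
        return False
    subsets = powerset(universe)
    f1 = all(B in F for A in F for B in subsets if A <= B)
    f2 = all((A & B) in F for A in F for B in F)
    return f1 and f2

def is_proper(F):
    # F3: the empty set is not a member
    return frozenset() not in set(F)

M = {0, 1, 2}
F_principal = [S for S in powerset(M) if 0 in S]   # every subset containing 0
F_improper = powerset(M)                           # every subset of M
F_bad = [frozenset({0}), frozenset({1})]           # violates F1 and F2

print(is_filter(F_principal, M), is_proper(F_principal))
print(is_filter(F_improper, M), is_proper(F_improper))
print(is_filter(F_bad, M))
```

The collection of all subsets containing a fixed point is a proper filter, the full power set is a filter that fails F3, and a collection that is not closed under supersets is not a filter at all.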
We have actually proved the following theorem.

Theorem: Suppose <V, +, *, 0, 1> is any ring and M is any set.
Let U be all functions from M to V. Interpret 0 in U to be the
constantly 0 function and 1 to be the constantly 1 function.
Define + and * in U pointwise, i.e. f + g is that function h
such that h(m) = f(m) + g(m). The resulting structure is a
ring. Suppose F is a filter on M (i.e. F is a collection of
subsets of M satisfying F1 and F2). If J(F) is defined by

{f : Z(f) is in F}

where Z(f) is

{m : f(m) = 0}

then J(F) is an ideal in the ring U. J(F) is a proper ideal if F
is a proper filter.

Now when do we get a field? It is a classic theorem of ring
theory that a quotient ring is a field exactly when the ideal J
is maximal. J is maximal means that it is not contained in any
larger proper ideal. There is a corresponding notion for
filters: F is maximal if it is not contained in any larger proper
filter. If F is a maximal filter will J(F) be a maximal ideal?
The problem is J(F) may be contained in some ideal J' not of the
form J(F') for some filter F'. (One can easily show that the
mapping F -> J(F) from filters to ideals is one-to-one and
preserves inclusions; the problem is, is it sufficiently onto so
as to preserve maximal objects?) If the original V is a field
then the mapping will preserve maximal objects.

Theorem: Suppose <V, +, *, 0, 1> is a field in the above
theorem and F is a maximal filter. Then J(F) is a maximal
ideal.

Proof. We will use the notation of the previous theorem. What
we will first show is:

Lemma: If J' is any proper ideal of U (and V is a field) then
there is a proper filter F' such that J' is contained in J(F').

Proof of Lemma: Let J' be a proper ideal of U. We will assume
that V is not of characteristic 2 (i.e., 1 + 1 ~= 0). The Lemma
and Theorem are still true in the characteristic 2 case but
require a separate proof and we are really only interested in the
case where V is R. Let F' be the set of Z(a) for a in J'. F' will
be the required proper filter. Suppose B is a superset of some
Z(a). Let b be defined by

b(m) = if m in B then 0 else 1.

Since J' is an ideal I1 shows that a * b is in J' but it is easy
to see that Z(a * b) is B. This proves F1. Now suppose a and b
are in J'. We want to show that (Z(a) intersect Z(b)) is in F' to
prove F2. What we need is a c in J' with Z(c) equal to (Z(a)
intersect Z(b)). Since V is a field we can define

a'(i) = if a(i) = 0 then 0 else 1/a(i)
b'(i) = if b(i) = 0 then 0 else 1/b(i).

Since J' is an ideal a" = a * a' and b" = b * b' are in J'. But
a" is 0 on Z(a) and 1 elsewhere and similarly with b". Let c =
a" + b". Since 1 + 1 ~= 0 we have that Z(c) is the
intersection of Z(a) and Z(b). So F' is a filter. Why is it
proper? If the empty set were in F' then it would be Z(a) for
some a in J'. Defining a' and a" = a * a' as before shows that 1
is in J', contradicting the fact that it is a proper ideal. Now
we know that J(F') is a proper ideal. But from the definition of
F' and J(F') it is easy to see that J(F') contains J'. QED.
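The key step of the Lemma, building an element c whose zero set is exactly the intersection of Z(a) and Z(b), can be illustrated on a finite index set with V taken to be the rationals. This is a sketch under those assumptions; the function names and the sample vectors a and b are hypothetical.

```python
from fractions import Fraction

def zero_set(f):
    # Z(f) = {m : f(m) = 0}, with f given as a list indexed by m
    return {m for m, v in enumerate(f) if v == 0}

def pseudo_inverse(f):
    # f'(m) = 0 where f(m) = 0, and 1/f(m) elsewhere
    return [Fraction(0) if v == 0 else 1 / v for v in f]

def mul(f, g):
    # pointwise product
    return [x * y for x, y in zip(f, g)]

def add(f, g):
    # pointwise sum
    return [x + y for x, y in zip(f, g)]

a = [Fraction(v) for v in (0, 3, 0, -2, 5)]
b = [Fraction(v) for v in (0, 0, 7, 1, 0)]

a2 = mul(a, pseudo_inverse(a))   # the element a": 0 on Z(a), 1 elsewhere
b2 = mul(b, pseudo_inverse(b))   # the element b": 0 on Z(b), 1 elsewhere
c = add(a2, b2)                  # takes values 0, 1 or 2 only

print(zero_set(a), zero_set(b), zero_set(c))
```

Because 1 + 1 is not 0 in the rationals, c vanishes only where both a" and b" vanish, which is the reason the proof sets aside characteristic 2.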

Now let us return to the Theorem. Suppose F is a maximal filter
and suppose J' is a proper ideal containing J(F). Construct F'
as in the Lemma. Since J' extends J(F) we have that F is
contained in F' (this follows from the definition of J(F) and
F'). But F was maximal so F = F'. But by the Lemma J(F') = J(F)
extends J', which shows that J' = J(F) and J(F) has no proper
extension among the proper ideals, i.e. it is maximal. QED.

Thus we see that maximal filters allow us to define extensions
of the reals which are fields. What about the order relation
which plays such a crucial role in analysis? If we define <= on
the ring U, which is a product of R's, by

<ai> <= <bi> iff {i : ai <= bi} in F

then the congruence relation, E(F), defined by the proper
filter F (this is E(J(F)) using our previous notation) satisfies
the substitution axiom

a E(F) b & c E(F) d & a <= c -> b <= d.

Thus one can define <= on the quotient ring. Will it linearly
order this ring? Not necessarily. It is easy to show that <= on
U is a partial order. The problem is dichotomy. Fortunately,
everything goes right if F is a maximal filter. To see why we
quote without proof the following theorem on filters.

Theorem: Let M be a set and F a proper filter of subsets of M.
Then the following are equivalent:

1. F is a maximal proper filter;

2. For all subsets A of M either A or M - A is in F;

3. If (A union B) is in F then either A or B is in F.

In any of these cases F is called an ultrafilter.

To apply this theorem, given a and b in the quotient ring (which
is a field) let

A = {i : ai <= bi}

B = {i : bi <= ai}.

Now A union B is all of M so it is in F. Since F is an
ultrafilter either A is in F or B is; this means either a <= b or
b <= a. In a similar way one shows all the other axioms for
linear order. In fact more is true. The quotient structure is a
real closed field! To explain why we consider the general
ultraproduct construction.

2.3.3 Filters and Ultrafilters 

To make our discussion self-contained we repeat some of our
previous definitions and theorems.

Given a set I, a proper filter over I is a non-empty set F of
subsets of I which satisfies the following axioms:

1. If S is an element of F, T is a subset of I and S is a
subset of T, then T is an element of F.

2. If S and T are elements of F, then S intersect T is an
element of F.

3. The empty set is not an element of F.

Informally, a filter is a collection of "large" subsets of I.
If F is the improper filter then all subsets are "large".

An ultrafilter is a proper filter that is not a proper subset
of another proper filter, i.e. a maximal filter. Ultrafilters
can be characterized axiomatically by adding to the above axioms
the axiom

If S is a subset of I, then either
S is an element of F or I - S is an element of F.

By an argument which uses the axiom of choice in the form of
Zorn's Lemma, every proper filter is a subset of some
ultrafilter.

A non-empty collection G of subsets of I is said to have the
finite intersection property iff

For every finite subset {S1, S2, ..., Sn} of G,
intersection (S1, S2, ..., Sn) is non-empty.

G can be extended to a proper filter iff G has the finite
intersection property.

For any i an element of I,

{ S : S is a subset of I and i is an element of S }

is an ultrafilter. Such ultrafilters are called principal
ultrafilters. Every ultrafilter over a finite set I is
principal. If I is infinite, then

{ S : S is a subset of I and I - S is finite }

is a proper filter. It is called the filter of cofinite sets.
Any ultrafilter containing it must be non-principal. Further, if
J is an infinite subset of I, then

{ S : S is a subset of I and I - S is finite } U {J}

has the finite intersection property. Thus, it is a subset of
some proper filter, which is a subset of some ultrafilter, and
this ultrafilter must be non-principal. Thus, any infinite
subset of I is an element of some non-principal ultrafilter.
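The claim that every ultrafilter over a finite set is principal can be confirmed by exhaustive search when the set is very small. The sketch below enumerates all 256 collections of subsets of a three-element set; the helper names are assumptions introduced for illustration.

```python
from itertools import combinations

def powerset(items):
    # all subsets of a finite collection, as frozensets
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_ultrafilter(F, I):
    # proper filter axioms 1 to 3 plus: S in F or I - S in F for every S
    I = frozenset(I)
    subsets = powerset(I)
    if not F or frozenset() in F:
        return False
    upward = all(T in F for S in F for T in subsets if S <= T)
    meets = all((S & T) in F for S in F for T in F)
    decides = all(S in F or (I - S) in F for S in subsets)
    return upward and meets and decides

I = {0, 1, 2}
subsets = powerset(I)

# every collection of subsets of I is a candidate
ultrafilters = [set(c) for c in powerset(subsets) if is_ultrafilter(set(c), I)]

# the principal ultrafilter generated by each point i
principal = [{S for S in subsets if i in S} for i in sorted(I)]

print(len(ultrafilters))
```

The search finds exactly one ultrafilter per point of I, each consisting of the subsets containing that point. Non-principal ultrafilters therefore need an infinite index set, and their existence rests on the axiom of choice rather than on any such enumeration.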

2.3.4 Ultraproducts and Ultrapowers 

Fix a first-order language L, which for convenience we assume is
single sorted, and an index set I. Suppose we have a structure Mi
for L for each i in I and a filter F over I. The filtered product
of the Mi over F is a structure U for L defined as follows:

1. The universe of U is the set of equivalence classes of
elements of the cartesian product of the universes of the
Mi's. If <ai> and <bi> are two elements of the cartesian
product, they are equivalent iff { i in I : ai = bi } is an
element of F. The fact that this is an equivalence relation
follows from the fact that F is a filter.

2. If k is a constant symbol of L, and ki is its
interpretation in Mi, then the interpretation of k in U is
[<ki>] where the square brackets indicate the equivalence
class.

3. If f is an n-ary function symbol of L, and fi is its
interpretation in Mi, then the interpretation of f in U is
a function g such that

g([<a1i>], ..., [<ani>]) = [<fi(a1i, ..., ani)>]

This can be shown to be well-defined. Well-defined means:

[<a1i>] = [<b1i>] & ... & [<ani>] = [<bni>]
->
[<fi(a1i, ..., ani)>] = [<fi(b1i, ..., bni)>]

4. If p is an n-ary predicate symbol of L, and pi is its
interpretation in Mi, then the interpretation of p in U is
a predicate q such that

q([<a1i>], ..., [<ani>]) iff
{ i in I : pi(a1i, ..., ani) } is an element of F

Again, this is well-defined.

The Mi from which U is constructed are called the components of
U. If F is an ultrafilter then U is called an ultraproduct. An
ultrapower is simply an ultraproduct in which each Mi is the same
structure.

2.3.5 Properties of Ultraproducts 

If the universe of each Mi is a fixed set S, then we can define
a function inj from S into the universe of U by

inj(x) = [<x>]

(i.e. inj(x) is the equivalence class of the I-tuple
<x,x,...>). The function inj is one-one. If each Mi is the same
structure M, then inj is a homomorphism of structures, and the
elements in the image of inj are called the standard elements of
the ultrapower. If F is principal, then inj will be an
isomorphism. Otherwise, inj will not be onto, and non-standard
elements will exist.

We are now ready to state a very remarkable theorem which is
far from obvious.

Fundamental Theorem of Ultraproducts: If A is a sentence of L,
A will be true in U iff

{ i in I : A is true in Mi }

is an element of F. If U is an ultrapower then inj is an
elementary embedding, that is the image of M under inj is an
elementary subsystem of the ultrapower U.

Clearly, if extra constant, function or predicate symbols are
added to L, and an interpretation of the symbols is given in Mi
for each i in I, these will induce a corresponding interpretation
in U.

The above notions and constructions generalize in a completely
straightforward way to many-sorted logics. In addition to being
able to add extra constants, functions and predicates, extra
sorts can be added at will, and an interpretation of each new
sort in each Mi will induce a corresponding new sort in U.
We formalize the asymptotic paradigm using a certain class of
ultraproducts. L0 is a language with the following sorts,
constants, functions and predicates:

- Four sorts:

1. RR, whose standard interpretation will be the real
numbers;

2. NN, whose standard interpretation will be the natural
numbers (regarded as disjoint from the real numbers
rather than as a subset of the reals);

3. NNseq, whose standard interpretation will be the
functions from the natural numbers to the natural
numbers;

4. RRseq, whose standard interpretation will be the
functions from the natural numbers into the real
numbers;

- The constants, functions and predicates of the language of
real closed fields, applied to the sort RR, and any other
symbols for real objects which we might need;

- The constants, functions and predicates of the language of
integer arithmetic, applied to the sort NN, and any other
integer objects we might need;

- Nev, a binary function of signature <NNseq, NN, NN> (this
represents the function which takes a sequence <ni> and an
integer j and returns nj);

- Rev, a binary function of signature <RRseq, NN, RR> (this
represents the function which takes a sequence <ri> and an
integer j and returns rj);

- CR, a unary function of signature <RR, RR>;

- M-, m-, m+, M+ and e, constants of sort RR.

We will uniformly abbreviate Nev(s,i) as s(i) and Rev(t,j) as
t(j).

Let I be the set of natural numbers. The Mi are obtained as
follows: fix a sequence <CRi> of functions from R to R, and
sequences <Mi->, <mi->, <mi+>, <Mi+> and <ei> of real numbers
such that each CRi, Mi-, mi-, mi+, Mi+ and ei satisfy the
cropping function and error axioms, and

1. <mi+> and <mi-> both converge to 0;

2. <Mi-> goes to minus infinity and <Mi+> goes to plus
infinity;

3. <ei> goes to 0

(i.e., CRi "converges to perfect precision". The fact that the
various sequences satisfy the cropping function and error axioms
implies that CRi(x) goes to x uniformly on bounded closed
intervals).

Mi is the structure for L0 in which

1. RR is interpreted as R;

2. NN is interpreted as N;

3. NNseq is interpreted as the set of all sequences of natural
numbers;

4. RRseq is interpreted as the set of all sequences of real
numbers;

5. The real closed field symbols and any additional real
objects are given their standard interpretations in R;

6. The integer arithmetic symbols are given their standard
interpretations in N;

7. Nev and Rev are interpreted as indicated above;

8. CR is interpreted as CRi;

9. M-, m-, m+, M+ and e are interpreted as Mi-, mi-, mi+, Mi+
and ei respectively.
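To give a concrete feel for a sequence <CRi> that "converges to perfect precision", the sketch below uses rounding to p significant binary digits as a stand-in for CRp. This is only an assumed example: the function crop is hypothetical, it ignores the overflow and underflow thresholds M-, m-, m+ and M+, and it is not claimed to satisfy the report's cropping function and error axioms exactly.

```python
import math

def crop(x, p):
    # round x to p significant binary digits (hypothetical stand-in for CR_p)
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(round(m * 2**p) / 2**p, e)

def max_error(p, b=4.0, samples=2001):
    # largest |crop(x, p) - x| over an evenly spaced grid on [-b, b]
    grid = [-b + 2 * b * k / (samples - 1) for k in range(samples)]
    return max(abs(crop(x, p) - x) for x in grid)

precisions = (4, 8, 16, 24)
errors = [max_error(p) for p in precisions]

for p, err in zip(precisions, errors):
    print(p, err)
```

On the bounded interval [-4, 4] the worst-case error is at most 4 * 2^(-p), so it shrinks with p independently of x. That is the kind of uniform convergence on bounded closed intervals which the text requires of <CRi>.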
Let F be a non-principal ultrafilter over I. The resulting
ultraproduct U is a combination of a non-standard model of the
theory of R, a non-standard model of the integers N, a set of
"hypersequences" of non-standard integers (i.e., sequences <ni>
where i ranges over both standard and non-standard integers) and
a set of hypersequences of non-standard real numbers. Call the
class of all such ultraproducts obtained by the above
construction NSM.

2.4.1 Further Symbols 

In this section we define an extension of the language L0.
First, as before one can define MR(x), ++, **, --, and // in
terms of CR. We also extend the language by adding a unary
predicate symbol "std" of signature <RR>. For each U in NSM,
interpret std in U as the standard elements of U, that is those x
in U of the form inj(y) for y in R. By an abuse of notation, std
will be used for the standard elements of any of the sorts. In
addition, add the following defined symbols:

1. fin(x) iff some y:RR [std(y) & |x| <= y] ("x is finite")

2. inf(x) iff ~fin(x) ("x is infinite")

3. diff(x) iff all y:RR [std(y) & y > 0 -> |x| < y] ("x is
infinitesimal")

4. x == y iff diff(x - y) ("x is infinitely close to y")
We call the resulting language L. For certain reasons which
will become apparent later, we wish to distinguish symbols and
formulas which have an interpretation in each Mi (i.e. symbols
and formulas of L0) from those which only have an interpretation
in the U's (i.e., std and any symbol defined in terms of std).
The former symbols and formulas we refer to as internal, and the
latter as external.

2.4.2 Axioms 

Let NSA be the set of all formulas of L which are valid for
every model in NSM. These are what we wish to consider the
"asymptotically true formulas". Any sentences which we adopt as
axioms for the paradigm must be in NSA. The choice of what axioms
to include is largely experimental; we examine what is needed in
proofs. The following statements in English summarize the axioms
which we have been using to date in verifying floating point
programs. This list is somewhat over-exhaustive, and will be cut
down as much as possible as future experience in using the
asymptotic paradigm indicates which are vital and which can be
dispensed with.

1. The axioms of real closed fields for RR plus any axiom
needed for any additional symbol for a real object (e.g. if
we consider the function symbol exp of signature <RR, RR>
in the language then we add axioms like

all x, y:RR (exp(x + y) = exp(x) * exp(y))

exp(0) = 1).

2. Axioms for integer arithmetic over N.

3. If A(x) is any formula of L with x a variable of type R,
the axiom which says that if S = { x in R : std(x) & A(x) }
is non-empty and bounded above by a standard real, then S
has a least upper bound.

4. The definitions linking defined symbols to the more
primitive symbols (e.g. x == y <-> diff(x - y)).

5. The cropping function and error axioms.

6. diff(m-) and diff(m+).

7. inf(M-) and inf(M+).

8. diff(e).

9. fin(x) -> CR(x) == x.

10. Axioms which guarantee the closure of RRseq and NNseq
under explicit and recursive definitions.

11. The fact that std in each sort forms an elementary
subsystem can be given by an axiom scheme.
Why is fin(x) -> CR(x) == x in NSA? Here is a proof. Fix an
ultrafilter F and a sequence <CRi> as above. Pick an element x =
[<xi>] from U such that fin(x). This implies that there exists b
in R such that

|x| < inj(b)

which means

J = { i in I : |xi| < b } is in F

which implies that J is infinite (F is non-principal). Every
term of the sequence

<xi : i in J>

is in [-b, b]. Since CRi(x) -> x uniformly on [-b, b], the
sequence

<CRi(xi) - xi : i in J>

goes to zero, i.e. for any positive c in R, there exists n such
that

i in J and i > n -> |CRi(xi) - xi| < c

or

{ i in I : |CRi(xi) - xi| < c } contains
J intersect {n + 1, n + 2, ...}

J and {n + 1, n + 2, ...} are both in F (the latter is cofinite
and F is non-principal), so their intersection is
in F, so any set containing their intersection is in F, so

{ i in I : |CRi(xi) - xi| < c }

is in F. Therefore,

|CR(x) - x| < inj(c)

in U. Since this was true for all positive c in R,

diff(CR(x) - x)

or

CR(x) == x. QED.

At this point, we can give a precise statement of the first
cropping function axiom. Any linearly ordered set is finite if
and only if it contains neither an infinite strictly ascending
sequence nor an infinite strictly descending sequence. This is a
consequence of Ramsey's theorem. We have found in our proofs that
an adequate axiom on MR is:

all s:RRseq [all n:NN [MR(s(n)) & s(n + 1) <= s(n)]
-> some n:NN [all m:NN [n < m -> s(n) = s(m)]]]

Since this sentence holds in every Mi, it holds in U. The
corresponding statement for ascending sequences also holds in U.
We will formalize the first cropping function axiom by these
two statements.

The following is a list of useful theorems which can be deduced
from the above axioms.

1. Every standard real is finite.

2. If x is finite, there exists a standard y such that y == x.

3. The finite elements of RR form a convex [3] proper subring
of U.

4. fin(x*y) -> fin(x) or fin(y)

5. The inverses of non-zero infinitesimals are infinite.

6. The inverses of infinite numbers are infinitesimal.

7. The infinitesimal elements form an ideal in the finite
elements.

8. diff(x*y) -> diff(x) or diff(y)

9. The infinitesimal elements are convex.

10. == is an equivalence relation.

11. fin(x) -> fin(CR(x))

[3] Given a set S with a partial ordering <= on it, a subset T of
S is convex iff all x,y,z (x in T and z in T and x <= y <= z -> y
in T).


2.4.3 Rationale 

What's "asymptotic" about the above? Suppose we fix a program
P which takes a floating point number as input and returns a
floating point number as output, and suppose this program always
terminates. Denote the function so computed by P. A given CR
over R determines which numbers can be inputs to P (namely the
fixed points of CR), and also determines what P(a) is for a given
input (obtained by executing P on a with floating point
operations being the precise operations followed by applying CR).
Thus, given a sequence <CRi> as above, we get a sequence <Pi> of
functions, with each Pi defined on the fixed points of CRi in
R. This sequence of functions induces a single function
(call it P) defined on the fixed points of CR in the
ultraproduct. This function will have the
first-order-expressible properties possessed by "almost all"
(i.e. all but finitely many) of the Pi's.

Suppose P was intended to compute the square root of its
input. How would we express the specification for a square root
program mentioned above? One way to express it might be

a >= 0 & MR(a) & fin(a) -> P(a)*P(a) == a

Suppose we could prove the above statement about P from axioms 
in NSA. Now, suppose there was some sequence <CRi> going to 
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some 


perfect precision, some . positive (standard) real b, and 
sequence of (standard) reals <ai> such that each ai is in the 
interval [-b, b], CRi(ai) = ai, and the sequence <Pi(ai )*Pi (ai ) - 
ai> docs not converge to zero (in other words, as the precision 
increases, we can choose machine representable numbers in a fixed 
bounded interval such that the result of running Pi on ai doesn’t 
get closer and closer to the square root of ai). There exists a 

positive real number c such that for all i, there exists j > i 
such that 

|Pj(aj)*Pj(aj) - aj| > c 

Thus, the set 

J = (j : |Pj(aj)*Pj(aj) - a j | > c ) 

is infinite. Let F be a non-principal ultrafilter containing J. 
In the following statements, [<ai>] is denoted by a, inj(b) is 
denoted (by an admitted abuse of notation) by b, and inj(-b) = 
-inj(b) by -b. Similarly, inj(c) is denoted by c. 

Since 

{ i in 1 ; CRi(ai) = ai } = I 

which is in F, 

CR(a) = a 

in the ultraproduct. Since 
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( i in I : -b <« ai <■ b } «« I 


which is in F, 

-b <= a <= b 
or 

I a I < b 

in the ultraproduct. Since std(b), this implies fin(a). 

By choice of F, 

{ i in I : | Pi(ai)*Pi(ai ) - ai | > c } 

is in F, so 

|P(a)*P(a) - a| > c 

in the ultraproduct. Since std(c) and c > 0, 

~diff(P( a )*P(a) - a) 
or 

”(P(a)*P(a) == a) 

This contradicts our original supposition that we were able to
prove P(a)*P(a) == a for all non-negative, finite fixed points of
CR. Thus, if we could prove the proposed postcondition for P in
our system, it would imply that for any <CRi>, any b and any <ai>
as above, Pi(ai)*Pi(ai) - ai --> 0. This is in some sense what we
mean by saying "P computes the square root function
asymptotically". Thus the above formalization allows us to express
specifications of the asymptotic behavior of programs easily and
naturally. It also provides us with a natural formalization of
the concepts of "large" (i.e. infinite), "small" (i.e.
infinitesimal) and "close" (i.e. infinitely close).
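
The asymptotic reading of the square root specification can be
observed numerically. The following is only a minimal sketch under
stated assumptions: crop is a hypothetical cropping function that
rounds to a given number of significant bits (standing in for CRi),
and sqrt_program is a hypothetical Newton iteration standing in for
Pi. Neither is taken from the report. The residual |Pi(a)*Pi(a) - a|
is expected to shrink as the number of bits grows.

```python
import math


def crop(x, bits):
    """Hypothetical CR_i: round x to `bits` significant binary digits."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)            # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** bits
    return math.ldexp(round(m * scale) / scale, e)


def sqrt_program(a, bits, steps=60):
    """Hypothetical P_i: Newton iteration with every operation cropped."""
    if a == 0.0:
        return 0.0
    y = crop(max(a, 1.0), bits)     # starting guess, a machine real
    for _ in range(steps):
        quotient = crop(a / y, bits)        # a // y
        total = crop(y + quotient, bits)    # y ++ (a // y)
        y = crop(total / 2.0, bits)         # (...) // 2
    return y


def residual(a, bits):
    """|P_i(a)*P_i(a) - a|, evaluated in ordinary double precision."""
    y = sqrt_program(a, bits)
    return abs(y * y - a)


if __name__ == "__main__":
    for bits in (8, 16, 24, 32, 40):
        a = crop(2.0, bits)         # a fixed point of the cropping function
        print(bits, residual(a, bits))
```

A finite table of residuals is of course not a proof; it only
illustrates the kind of convergence that the specification
P(a)*P(a) == a asserts in the ultraproduct.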

2.4.4 Induction and Recursion in Non-Standard Models 

We will need to do proofs by induction on N in the course of 
proving programs, and thus we need to investigate how this is 
done in a non-standard setting. 

The set-theoretic statement of the induction principle ("Every 
subset of the integers containing 0 and closed under successor is 
the set of all integers") does not hold for non-standard models 
of arithmetic (the proper subset consisting of the standard 
integers violates the principle). The first-order formula 
statement of the induction principle ("For every first-order
formula A(i) where i is a free variable of sort NN,

A(0) & all i:NN [A(i) -> A(i+1)] -> all i:NN [A(i)]")

also does not hold for an arbitrary formula of L. For example, it
does not hold for A = "std(i)". It does, however, hold if A is an
internal formula. Why is this restriction sufficient?

If A is internal, then every symbol occurring in A has an
interpretation in each Mi. In each Mi, NN is interpreted as the
standard non-negative integers, and so the above formula stating
the induction principle for A is true in Mi. Since it is true in
every Mi, it is true in U. The preceding argument obviously would
not go through if A contained an occurrence of a symbol which has
no interpretation in the Mi.

If A(i) is not a formula of LO, the following more limited
statement holds:

A(0) & all i:NN [A(i) -> A(i+1)] ->
all j:NN [std(j) -> A(j)]

that is, if A holds for 0, and A(i) implies A(i+1), then A holds 
for all standard integers. This is true because the set of all 
standard integers which satisfy A is a subset of the standard 
integers which contains 0 and is closed under the operation of 
adding 1. By the principle of set induction, which holds for the 
standard integers, the set of all standard integers satisfying A 
must contain all standard integers. 

The same general principle also holds for definitions of 
hypersequences by recursion. That is, a definition which 
involves some external symbols will only define the hypersequence 
on the standard integers. 

To summarize:

1. Proofs by induction in the non-standard case work just as
   in the standard case for statements expressible in LO.
   Proofs by induction of statements not expressible in LO are
   more limited.

2. Definitions of hypersequences in the non-standard case,
   whether by formula or by recursion, work just as in the
   standard case, but only for definitions expressible in LO.


Chapter 3


The Asymptotic Paradigm and the Verification Condition Approach

3.1 Verification Condition Generation

The classical method of proving programs correct entails the
use of two languages: a programming language and an assertion
language. The latter is usually an extension of the Boolean
expression portion of the former. The basic datum for the
Verification Condition Generation (VCG) approach is an asserted
program, i.e. a Hoare sentence {p} P {q} with Pre-Condition p and
Post-Condition q, together with embedded assertions attached to
some of the executable statements in P. The only requirement is
that there be an attached assertion within each loop. The basic
theorem of Floyd shows how to construct mathematical statements
S1, ..., Sn with the property that if all the Si are true then so
is the starting Hoare sentence. The generation of the
verification conditions (VCs) S1, ..., Sn is formal and schematic;
that is, they contain only the symbols found in the statements of
P and don't depend on the meaning of the symbols. Thus one can
speak of proving the Hoare sentence for programs which really
can't be executed, such as programs over our non-standard
structures in NSM. What one really proves is the Verification
Conditions S1, ..., Sn from the chosen axioms of NSA. This is a
formal exercise. As we have illustrated, the truth of the Hoare
sentences over the structures U in NSM implies increasing
precision over the really executable domains of actual machine
reals.
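
To make the "formal and schematic" character of VC generation
concrete, here is a minimal sketch for the simplest case: straight
line initialization followed by a single WHILE loop with one
attached invariant. It works purely by textual substitution, so it
never consults the meaning of the symbols. The function names
(substitute, wp, loop_vcs) and the string representation of
assertions are assumptions made for this illustration, not part of
any existing tool.

```python
import re


def substitute(assertion, var, expr):
    """Replace each whole-word occurrence of `var` by `(expr)`."""
    pattern = rf"\b{re.escape(var)}\b"
    return re.sub(pattern, lambda _match: f"({expr})", assertion)


def wp(assignments, assertion):
    """Push `assertion` backwards through a list of (var, expr) assignments."""
    for var, expr in reversed(assignments):
        assertion = substitute(assertion, var, expr)
    return assertion


def loop_vcs(pre, init, guard, invariant, body, post):
    """Return the three VCs for: {pre} init; WHILE guard {invariant} body {post}."""
    return [
        f"{pre} IMPLIES {wp(init, invariant)}",
        f"{invariant} & {guard} IMPLIES {wp(body, invariant)}",
        f"{invariant} & ~({guard}) IMPLIES {post}",
    ]


if __name__ == "__main__":
    # A hypothetical one-variable loop, written in the report's notation.
    for vc in loop_vcs(
        pre="TRUE",
        init=[("X", "1")],
        guard="X//2 < X",
        invariant="0 < X <= 1 & ~diff(X) & MR(X)",
        body=[("X", "X//2")],
        post="FALSE",
    ):
        print(vc)
```

Because the generator only rewrites strings, it produces the same
VCs whether the variables are thought of as ranging over machine
reals or over a non-standard structure; only the axioms used to
prove them differ.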


How do we apply the asymptotic paradigm in the context of the 
VC method? We simply think of the floating point variables of 
the program to be verified as ranging over the 
machine-representable elements of some model in NSM. Since there 
are certain features of most programming languages which involve
interaction between floating point and integer variables (such as
rounding a real off to an integer), we should also think of the
integer variables as ranging over a non-standard model of the
integers. Pre- and postconditions and internal assertions for
programs can then be written using external symbols, and the 
asymptotic axioms can be used to prove VCs. 


The above approach can be used to do non-trivial asymptotic
analysis of programs. There is a problem with it, however.
Consider the following asserted program:

{TRUE}

X := 1;

DO WHILE (X//2 < X);

{ 0 < X <= 1 & ~diff(X) & MR(X) }

X := X//2 ; 

END; 

END; 

{FALSE} 

X is a floating point variable. 

For any finite machine real input XO this program halts. Since 
that is the case the Hoare sentence {TRUE} P {FALSE} is not 
true. On the other hand, the VCs in the non-standard case are 
provable! We can prove that the loop invariant is true when the 
loop is entered, that it is an invariant of the loop, and that 
the Post-Condition follows from the negation of the loop guard 
and the invariant. Here are the VCs. 

VC 1

TRUE

IMPLIES

0 < 1 <= 1 & ~diff(1) & MR(1)

VC 2

0 < X <= 1 & ~diff(X) & X//2 < X & MR(X)

IMPLIES

0 < X//2 <= 1 & ~diff(X//2) & MR(X//2)

VC 3

~(X//2 < X) & 0 < X <= 1 & ~diff(X) & MR(X)

IMPLIES

FALSE

The first VC is easily proved, since 1 > 0 is an axiom of
ordered fields, <= is reflexive and 1 is standard and non-zero
and thus not infinitesimal. We also see that we need MR(1). Our
original cropping axioms did not imply this, an oversight which
we discovered through experimentation. That MR(1) holds follows
from our final cropping axiom, Axiom 10.

To prove the second VC, assume the hypothesis. First, we will
prove ~diff(X//2). Suppose diff(X//2); then X//2 == 0. 0 < X <= 1
implies 0 < X/2 <= 1/2. Therefore, X/2 is finite, so

X/2 == CR(X/2) = X//2 == 0

so X/2 == 0. But multiplying both sides by 2 (a finite number)
gives us X == 0, i.e. diff(X), a contradiction. Therefore,
~diff(X//2).

Next, we want to prove 0 < X//2 <= 1. By the hypothesis, X//2 <
X <= 1, so X//2 <= 1. 0 < X implies that 0 < X/2, so by
monotonicity of CR,

0 = CR(0) <= CR(X/2) = X//2

If X//2 = 0, then diff(X//2), which we have already disproved.
Therefore, 0 < X//2. Finally, one has MR(X//2) since X//2
is CR(X/2). This finishes the second VC.

To show the third VC we show that the hypothesis

~(X//2 < X) & 0 < X <= 1 & ~diff(X) & MR(X)

is always false (so that it implies FALSE). In fact the loop
guard (X//2 < X) will always be true when 0 < X and ~diff(X).
How could X//2 = CR(X/2) be >= X? Since X > 0 we have 0 < X/2
< X. Applying CR we get

CR(0) = 0 <= X//2 <= CR(X).

If X//2 >= X then

X <= X//2 <= CR(X)

so that applying CR we would get

CR(X) <= X//2 <= CR(X)

which shows that X//2 is CR(X). But MR(X), so X//2 is X. But
X//2 == X/2, so X == X/2. Multiplying both sides by 2 and then
subtracting X gives X == 0, a contradiction.

What went wrong? The Hoare sentence is proved, yet it is not
true in the finite cases. What the proof of VC 3 actually shows
is that in non-standard models the program does not terminate.
But since it doesn't terminate it must be that the sequence of
values of X for the successive iterations of the loop forms an
infinite, strictly descending sequence of machine reals, which
violates the first cropping function axiom.

Consider a finite case with a fixed CR over R. By the
monotonicity of CR, we can prove that the sequence of values of X
is non-increasing. By the first cropping function axiom, this 
sequence must eventually reach a fixed point, at which point the 
loop terminates. This only happens, however, because of rounding 
error or underflow. At some point, X becomes so small that 
either a division by 2 causes it to underflow to 0, which is then 
the fixed point, or X//2 is rounded up to X, in which case that 
value of X becomes the fixed point. If we take a sequence of 
cropping functions <CRi> going to perfect precision, we find that 
it takes more and more iterations of the loop before this 
happens. Let Ni be the number of iterations it takes for the 
loop to terminate with cropping function CRi. The corresponding 
integer [<Ni>] in the ultraproduct is non-standard because Ni 
goes to infinity. Thus, the loop terminates when executing over 
the non-standard domain, but only in a non-standard number of 
steps. Nothing in our naive application of the asymptotic 
paradigm made allowance for a program to execute for a 
non-standard number of steps. 
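
The growth of Ni can be observed with a small experiment. The
sketch below assumes a deliberately simple family of cropping
functions, rounding to the nearest multiple of 2**-i, which is a
hypothetical stand-in for <CRi> and not the report's own
definition. It counts the iterations of the halving loop for
several values of i.

```python
def make_crop(i):
    """Hypothetical CR_i: round to the nearest multiple of 2**-i."""
    scale = 2.0 ** i

    def crop(x):
        return round(x * scale) / scale

    return crop


def halving_steps(i):
    """Number of iterations N_i of the halving loop under CR_i."""
    crop = make_crop(i)
    x = 1.0                      # X := 1
    steps = 0
    while crop(x / 2.0) < x:     # DO WHILE (X//2 < X)
        x = crop(x / 2.0)        # X := X//2
        steps += 1
    return steps


if __name__ == "__main__":
    for i in (8, 16, 32):
        print(i, halving_steps(i))
```

For every fixed i the loop halts, but the iteration count grows
with i, which is the finite shadow of a loop that terminates only
after a non-standard number of steps in the ultraproduct.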


We can, however, incorporate the idea of a program executing 
for a non-standard number of steps into the VC approach. Given a 
sequence <CRi>, we can imagine running a program P with cropping 
function CRi for each i. Suppose P contains a floating point 
variable X. As P runs with CRi, X takes on various machine real 
values for various numbers of execution steps. This defines a 
sequence of machine real values, one sequence for each floating
point variable. Likewise, we get a sequence of integers for each
integer variable, and a sequence of "control points" which define
how control passes through P as execution progresses. We get one
such collection of sequences for each CRi. These sequences can be
combined into a collection of hypersequences in the corresponding
ultraproduct. We get a hypersequence of non-standard reals for
each floating point variable, a hypersequence of integers 
(possibly non-standard) for each integer variable, and a 
hypersequence of control points in P. These hypersequences define 
the execution of P over the non-standard domain for non-standard 
numbers of steps. 

How does this idea of hypersequences actually enter into the 
verification of a program in the VC approach? Actually, the 
impact is relatively minimal. The same verification conditions 
are generated, and they are proved in the same way as before. 
There is only one major difference, which occurs in the proof of 
loop invariants. The proof of a loop invariant is essentially a 
proof by induction on the number of iterations of the loop. In 
other words, we are essentially proving 


all n:N [ the loop invariant is true after n iterations ] 


by induction on n. However, recall that when performing
induction over the non-standard integers, if the statement we are
proving is external, the proof only holds for standard integers.
This means that if a loop invariant is an external statement, the
usual VC method proof only proves that the loop invariant is true
for a standard number of iterations. Thus, if we need to use an
external invariant, we must also prove that if the loop ever
terminates, it terminates in a standard number of steps.
Otherwise, the loop may run for a non-standard number of steps
and then terminate with the loop invariant false. Notice that
this is exactly what happens in the example above: the loop
invariant contains the external symbol diff, and so the invariant
is only true for a standard number of iterations. As shown
above, the loop does terminate after a non-standard number of
steps with the loop invariant false.

The need to prove termination in a standard number of steps for 
external loop invariants is the only real change that must be 
made in the VC method in order to apply the asymptotic paradigm. 
For internal loop invariants, the method works exactly as 
before. We present an example of an external invariant and a 
proof of termination in a standard number of steps, in the 
section 4. In general we will wish to avoid using external loop 
invariants wherever possible, since our experience in examining 
programs executing over non-standard domains suggests that such 
programs rarely execute in a standard number of steps. In some 
cases we can replace an external invariant by an internal 
invariant which implies the original invariant. For example, we 
will often need to show that for every iteration of a loop, 
certain quantities are finite. We cannot prove this by making it
part of the loop invariant, since "fin" is an external symbol.
What we generally do in such cases is to prove the appropriate 
quantities finite by showing that their values are bounded by 
certain fixed numbers which are finite. Saying that a number is 
between two other numbers is an internal statement, and if the 
bounds are fixed finite numbers, then the quantity they bound is 
finite by the convexity of the finite numbers. 



We wish to write and verify a program to compute an
approximation for y where

dy/dx = y, y(0) = 1

by the Euler method. Consider the following asserted program:

{ X > 0 & fin(X) & N > 0 & fin(N) }

Y := 1;

POS := 0;

I := 0;

DO WHILE(POS < X);

{ POS == (I*X)/N & Y == (1 + (X/N))^I & I <= N }

Y := Y**(1 ++ (X//N));

POS := POS ++ (X//N);

I := I + 1;

END;

END;

{ Y == (1 + (X/N))^N }

X is the x for which y(x) is being computed. N is the number of
steps to be performed in applying the Euler method. Y is the
output of the program. POS is the current value of x in the
Euler method. I is the number of times the loop has been
executed, and is in the program primarily to be used in the loop
invariant and in the proof of correctness. Note that we are
implicitly assuming that whenever a floating point operation is
performed using an integer variable, the value of the integer
variable is converted to the corresponding element of R.


Let us first examine the pre- and postconditions. The
postcondition simply states that the output value should be
infinitely close to the exact value given by the Euler method.
What this says in terms of asymptotic behavior is that as the
precision of the cropping function increases, the output value
should converge to the exact Euler method value.


The precondition requires that both X and N be positive and 
finite. What does the finiteness requirement mean? Recall that 
"finite" essentially means "bounded by a fixed number as the 
precision increases." Suppose we increased X without bound as 
the precision increased. As X becomes larger and larger, the 
magnitude of the error in computations like X//N also becomes 
larger. If we increased X fast enough, this might offset the 
increasing precision of CR, and so the output might not converge
to the exact answer. Suppose, on the other hand, we increased N
without bound as the precision increased. As N becomes larger
and larger, the number of times the loop iterates increases, and
so the cumulative error in the entire computation increases.
Again, if N increased fast enough, this could offset the
increasing precision of CR. In fact, if we left out these
restrictions on X and N we would not be able to prove the
program.

The loop invariant says that the current value of POS is 

infinitely close to I times the step size, and Y is infinitely 
close to the exact Euler method value after I iterations, and 
that I is <= the total number of steps to be performed in the 
Euler method. The first thing to notice about this invariant is 
that it is not an internal formula because it contains the symbol 
== which is defined using std. Thus, the standard VC methods 
will only prove that the loop invariant holds for a standard 
number of iterations of the loop. We will have to show that the 
loop terminates in a standard number of steps. 

Let us now examine the VCs for this program. There are three
of them. The first one says that if the precondition is met, the
loop invariant is true when the loop is initially entered. The
second says that if the loop invariant is true at the top of the
loop and the loop guard is true, then the loop invariant will be
true after the loop body is executed. The third VC says that if
the loop invariant is true at the top of the loop and the loop
guard is false, then the postcondition is true when the program
terminates.

VC 1

X > 0 & fin(X) & N > 0 & fin(N)

IMPLIES

0 == (0*X)/N & 1 == (1 + (X/N))^0 & 0 <= N

VC 2

POS == (I*X)/N & Y == (1 + (X/N))^I & I <= N & POS < X

IMPLIES

POS++(X//N) == ((I+1)*X)/N
& Y**(1++(X//N)) == (1+(X/N))^(I+1) & I+1 <= N

VC 3

POS == (I*X)/N & Y == (1 + (X/N))^I & I <= N & ~(POS < X)

IMPLIES

Y == (1 + (X/N))^N

The proof of VC 1 is simple, since the conclusion of the
implication simplifies to

0 == 0 & 1 == 1 & 0 <= N

The first two conjuncts are true because == is an equivalence
relation. The last is true a fortiori, since 0 < N.

Next, examine VC 3. The proof breaks into two cases, the case
when X is infinitesimal and the case when it is not.

If X is infinitesimal, then X == 0. N is finite and so is a
standard integer. Therefore, 1/N is a standard, non-zero
rational number and so we can multiply both sides of X == 0 by
1/N to get

X/N == 0

From this we get

1 + X/N == 1

We now use the fact that for any standard integer J,

(1 + X/N)^J == 1          (1)

The proof of this is in Appendix A.

From the hypothesis of VC 3, we have I <= N, and N is standard,
so I is standard. By applying (1) with J = I and J = N, we get

Y == (1 + (X/N))^I == 1 == (1 + (X/N))^N

This proves VC 3 in the case where X is infinitesimal.

Now, suppose X is not infinitesimal. We have

I <= N & Y == (1 + (X/N))^I

If we can prove that in fact I = N, then the conclusion of VC 3
will be proved.

Suppose I < N. Then (I*X)/N < X. Thus we have that POS >= X but 
POS is infinitely close to something less than X. This implies 
that POS == X, and so (I*X)/N == X by the transitivity of ==. We 
can now multiply both sides of (I*X)/N == X by N/X (note that 
this is finite because N is finite and X is not infinitesimal) to 
get I == N. But we assumed I < N, and I and N are both integers, 
so the difference between them must be at least 1 and so cannot 
be infinitesimal, a contradiction. Therefore I must be equal to 
N and VC 3 is proved. 

Now examine VC 2. First we prove 

POS ++ (X//N) == ((I + 1)*X)/N 

from the hypothesis of the VC. First we prove that all the
quantities we need to deal with are finite. I is a positive
integer, I <= N and N is finite, so I is finite. X is finite and
N is a positive integer, so X/N is finite. Therefore,

X//N = CR(X/N) == X/N

and so X//N is finite. I finite and X/N finite implies (I*X)/N
is finite. POS is infinitely close to (I*X)/N, so POS is
finite.

By adding POS to both sides of X//N == X/N, we get

POS + (X//N) == POS + (X/N)

The left side is a sum of two finite numbers and so is finite.
Therefore,

POS ++ (X//N) = CR(POS + (X//N))
             == POS + (X//N)
             == POS + (X/N)
             == ((I*X)/N) + (X/N)
              = ((I + 1)*X)/N

Next, we prove

Y**(1 ++ (X//N)) == (1 + (X/N))^(I + 1)

X/N and X//N are finite, so 1 + (X/N) and 1 + (X//N) are finite.
From this we get

1 ++ (X//N) = CR(1 + (X//N))
           == 1 + (X//N)
           == 1 + (X/N)

We now use the fact that if Z is a finite floating point number
and J is a finite integer, Z^J is finite. The proof is in
Appendix A. By this, (1 + (X/N))^I is finite, and Y is infinitely
close to it, so Y is finite. Therefore, Y*(1 ++ (X//N)) is
finite, and so

Y**(1 ++ (X//N)) = CR(Y*(1 ++ (X//N)))
                == Y*(1 ++ (X//N))
                == Y*(1 + (X/N))
                == (1 + (X/N))^I * (1 + (X/N))
                 = (1 + (X/N))^(I + 1)

Finally, we wish to prove that I + 1 <= N, i.e. that I < N.
Suppose not, i.e. suppose I >= N. We have I <= N, so I = N. By
substituting into the other conjuncts of the hypothesis, we get
POS == X and Y == (1+(X/N))^N. We also have POS < X. Is this a
contradiction? The answer is no. It is possible to have POS < X
and at the same time POS == X, as long as POS is only less than X
by an infinitesimal amount. Does this mean that the program has
an error in it, or do we just need to change our loop invariant
to one that will give us provable VCs?

Consider the situation in which I = N, POS == X and POS < X.
This occurs when the loop has executed N times, POS has been
incremented by X/N each time, but due to rounding errors, POS
turns out to be slightly less (i.e. infinitesimally less) than X.
In this case, the loop will execute at least once more, which
will result in a value for Y that is too large. Thus we see that
this program is actually incorrect. The basic problem is that
the loop guard cannot be trusted to terminate the loop correctly
due to roundoff error in incrementing POS. The easiest way to fix
the program is to change the guard to I < N. Having done this, we
have no need of the variable POS, since it is not used anywhere
but in the loop guard, so we can change the program to

{ X > 0 & fin(X) & N > 0 & fin(N) }

Y := 1;

I := 0;

DO WHILE(I < N);

{ Y == (1 + (X/N))^I & I <= N }

Y := Y**(1 ++ (X//N));

I := I + 1;

END;

END;

{ Y == (1 + (X/N))^N }

The three VCs generated for this program are
proved by arguments similar to those above (in fact, the proofs
are even easier). Note that for this program, there is no
difficulty in proving that the loop terminates after N
iterations, since we have I <= N from the invariant and I >= N
from the negation of the loop guard.

The loop invariant for the fixed program is still an external
formula, so the proof of the second VC only implies that the loop
invariant holds for a standard number of iterations of the loop.
We must therefore show that the loop terminates in a standard
number of steps. This is easy though. The quantity N - I is an
integer which decreases by 1 every time the loop body is
executed. Therefore the loop cannot iterate more than N times,
and N is standard.
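
The defect in the POS-guarded loop can be reproduced at one fixed
precision. The sketch below is an illustration only: it transcribes
both versions of the program into Python and uses ordinary IEEE
double arithmetic in place of the cropped operations ++, ** and //.
The function names are assumptions made for this example.

```python
def euler_pos_guard(x, n):
    """Original program: the loop is guarded by POS < X."""
    y, pos, i = 1.0, 0.0, 0
    step = x / n                  # X//N
    while pos < x:
        y = y * (1.0 + step)      # Y := Y**(1 ++ (X//N))
        pos = pos + step          # POS := POS ++ (X//N)
        i += 1
    return y, i


def euler_count_guard(x, n):
    """Corrected program: the loop is guarded by I < N."""
    y, i = 1.0, 0
    step = x / n
    while i < n:
        y = y * (1.0 + step)
        i += 1
    return y, i


if __name__ == "__main__":
    print(euler_pos_guard(1.0, 10))
    print(euler_count_guard(1.0, 10))
```

With X = 1 and N = 10, ten additions of 0.1 in double precision
leave POS just below 1, so the POS-guarded loop runs an eleventh
time and returns a value of Y that is too large, while the
counter-guarded loop stops after exactly N iterations.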


3.3 Finding a Zero of a Continuous Function by Bisection 

The second example is also carried out in the VC approach.
Suppose we have a continuous function f0 from R to R, and two
numbers A and B such that f0(A) and f0(B) are of opposite sign.
We know from the Intermediate Value Theorem that f0 must have a
zero between A and B. We wish to write and verify a program which
finds an approximation to that zero. Before we present a
candidate program, let us examine the problem to see how we might
write such a program and what its pre- and postconditions should
be.

First of all, to use non-standard analysis on a function we
must make a non-standard extension of it. Given a non-standard
model U from NSM, we can get a non-standard extension of f0 by
adding a unary function symbol, say F, to L, and interpreting it
as f0 in each component of U. These interpretations will induce
an interpretation of F in U which we will call f. For any x in
R,

f(inj(x)) = f([<x,x,...>])
          = [<f0(x),f0(x),...>]
          = inj(f0(x))

That is, f is identical with f0 on the standard elements. Thus,
in particular, f takes standard elements to standard elements.

Next, note that in general, we will not really be able to
compute f, but rather some approximation to f given by a
program. Suppose we have a program which computes a function g
such that

all x:RR [ fin(x) -> MR(g(x)) & g(x) == f(x) ]

In other words, g is a "machine version" of f on the finite
reals. We will assume that the A and B we have are machine
reals, and that g is a sufficiently good approximation to f at A
and B that g(A) and g(B) are also of opposite sign (if these two
assumptions do not hold we can hardly expect to be able to
compute an approximate zero for f0).

How would we go about finding a zero of f0? The usual method
is to use some algorithm which generates a number C between A and
B which is a "guess" at the zero (we will use bisection). If
g(C) = 0, the process terminates. If g(C) is not 0, then it is
opposite in sign to either g(A) or g(B). Whichever of the "old"
endpoints has opposite sign, it and C form the endpoints of a
new, smaller interval, and the process is repeated with the new
endpoints. This process is iterated until either a zero of g is
found or the endpoints become "close". In the latter case, either
one of the endpoints can be taken as the approximation to the
zero.

We can formulate the specifications for the program as the
following pre- and postconditions. A0 and B0 are the initial
values of the endpoints; to simplify writing, OPP(x,y) will be
used as an abbreviation for "x < 0 < y or y < 0 < x".

PRE: fin(A0) & fin(B0) & OPP(g(A0),g(B0))

POST: fin(A) & fin(B) & fin(C)
      & [g(C) = 0 or (OPP(g(A),g(B)) & A == B)]

As usual, the finiteness restrictions on the values of A0 and B0
simply signify that we do not expect the program to give us
better and better approximations as the precision increases, if
it is given larger and larger inputs also. The postcondition
simply states that we have either found a zero of g or when the
program terminates, the values of g at the endpoints are still of
opposite sign and the endpoints are "close" (i.e. infinitely
close).

Why does the above postcondition ensure that we have found a
number infinitely close to a zero of the original function f0?
To show this, we will make use of the following fact, which is
proved in Appendix A:

all x:RR [ fin(x) ->
    some y:RR [std(y) & y == x & f0(y) == g(x)] ]

In other words, for any finite real x, there is a standard real y
infinitely close to x such that f0(y) is infinitely close to
g(x).

Suppose the program terminates with g(C) = 0. By the above
fact, there is a standard D such that D == C and f0(D) == g(C) =
0. But f0(D) is standard, and the only standard infinitesimal
number is 0, therefore f0(D) = 0. Thus C is infinitely close to a
zero of f0.

Suppose the program terminates with A == B and OPP(g(A),g(B)).
Again, applying the above fact we get standard reals D1 and D2
such that D1 == A, D2 == B, f0(D1) == g(A) and f0(D2) == g(B).
Since A == B, by transitivity D1 == D2. Since D1 and D2 are
standard, D1 = D2. Therefore f0(D1) = f0(D2) == g(B). Thus
f0(D1) is a standard real which is infinitely close to two
numbers of opposite sign (g(A) and g(B)), and so it must be
infinitely close to 0. Since it is standard, it must be 0, and so
both A and B are infinitely close to a zero of f0.

How can we code the process described above as a program so
that it will stop if it finds a zero or continue until A and B
are infinitely close? We cannot simply test to see if two 
numbers are infinitely close, because "infinitely close" is an 
asymptotic property which is not true or false for a given 
machine or precision. Thus, we must find another way to ensure 
that if no zero is found, the program will terminate with A and B 
infinitely close. 


Consider the following asserted program. PRE and POST stand
for the conditions given above. We have added A0 <= B0 to the
precondition just to simplify the formulas slightly:

{ PRE & A0 <= B0 }

A := A0;

B := B0;

C := (A ++ B)//2;

DO WHILE(g(C) <> 0 & A < C < B);

{ A0 <= A <= B0 & A0 <= B <= B0
& OPP(g(A),g(B)) & C = (A ++ B)//2 }

IF OPP(g(A),g(C))
THEN B := C;
ELSE A := C;

C := (A ++ B)//2;

END;

END;

{POST}

Note that the loop invariant is an internal statement, and so it
need not be proved that the loop terminates in a standard number
of steps.

One thing about the above program needs explanation, namely,
why do we have the second conjunct in the loop guard? Since C is
always set to the average of A and B, isn't C always between A
and B? The answer is no, because roundoff error in computing the
average may result in a value for C that is not strictly between
A and B. Of course, such roundoff error will only happen when A
and B are very close together. We will show below that since A
and B get closer and closer as the loop continues to execute,
such a roundoff error eventually must happen. This is the way we
ensure that if no zero is found the program will terminate with A
== B.
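
The role of the second conjunct can be seen by running the same
loop at one fixed precision. The sketch below is a transcription
into Python under stated assumptions: IEEE double arithmetic stands
in for the cropped operations, and the sample function in the demo
(x*x - 2) is a hypothetical choice of g made only for illustration.

```python
def opp(u, v):
    """OPP(u, v): u and v are of strictly opposite sign."""
    return u < 0.0 < v or v < 0.0 < u


def bisect(g, a0, b0):
    """Bisection with the guard g(C) <> 0 & A < C < B."""
    a, b = a0, b0
    c = (a + b) / 2.0                     # C := (A ++ B)//2
    steps = 0
    while g(c) != 0.0 and a < c < b:
        if opp(g(a), g(c)):
            b = c
        else:
            a = c
        c = (a + b) / 2.0
        steps += 1
    return a, b, c, steps


if __name__ == "__main__":
    a, b, c, steps = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
    print(a, b, c, steps)
```

For this g no exact zero is found, so the loop stops only when the
rounded midpoint coincides with one of the endpoints. At that point
A and B are as close as the precision allows, which is the
fixed-precision counterpart of terminating with A == B.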

Let us first examine the VCs for this program. There are four
of them. The first one says that if the precondition is true
initially then the loop invariant will be true when the loop is
first entered. The second says that the loop invariant is
preserved by the execution of the loop in the case when the THEN
branch of the IF THEN ELSE is followed, and the third VC says the
same in the case where the ELSE branch is taken. The fourth VC
says that if the loop terminates with the loop invariant true
then the POST is true.

VC 1

PRE & A0 <= B0

IMPLIES

A0 <= A0 <= B0 & A0 <= B0 <= B0
& OPP(g(A0),g(B0)) & (A0 ++ B0)//2 = (A0 ++ B0)//2

VC 2

A0 <= A <= B0 & A0 <= B <= B0
& OPP(g(A),g(B)) & C = (A ++ B)//2
& g(C) <> 0 & A < C < B
& OPP(g(A),g(C))

IMPLIES

A0 <= A <= B0 & A0 <= C <= B0
& OPP(g(A),g(C)) & (A ++ C)//2 = (A ++ C)//2

VC 3

A0 <= A <= B0 & A0 <= B <= B0
& OPP(g(A),g(B)) & C = (A ++ B)//2
& g(C) <> 0 & A < C < B
& ~OPP(g(A),g(C))

IMPLIES

A0 <= C <= B0 & A0 <= B <= B0
& OPP(g(C),g(B)) & (C ++ B)//2 = (C ++ B)//2

VC 4

A0 <= A <= B0 & A0 <= B <= B0
& OPP(g(A),g(B)) & C = (A ++ B)//2
& (g(C) = 0 or C <= A or B <= C)

IMPLIES

POST

The proof of the first VC is trivial. The only thing which
needs to be proved for the second VC is that if A and B are both
between A0 and B0 and C is strictly between A and B, then C is
between A0 and B0. This is also trivial.

For the third VC, we need to prove that if g(A) and g(B) are of 
opposite sign and g(A) and g(C) are not of opposite sign, then 
g(C) and g(B) are of opposite sign. This is also trivial. 

Now examine the forth VC. Assume the hypothesis. We will first 
prove that A, B and C are finite. AO and BO are finite and A and 
B are both between AO and BO. Since the finite elements are 
convex, A and B must be finite. Therefore, A + B is finite, so A 
++ B = CR( A + B) is finite. This implies that (A ++ B)/2 is 
finite, and so C = (A ++ B)//2 = CR((A ++ B)/2) is finite. 

If C(C) = 0, the proof is done. Otherwise, we must prove that 
g(A) and g(B) have opposite sign and A ~ B. The first is true by 
hypothesis. Suppose A is not infinitely close to B. By the 
finiteness statements proved above, 


C = (A ++ B)//2 
= CR( (A ++ B)/2) 
~ (A ++ B)/2 
= CR ( A + B)/2 
~ (A + B)/2 
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Since g(A) and g(B) are. of opposite sign, A is not equal to B. 
Therefore, (A + B)/2 is strictly between A and B. By hypothesis, 
C is either <= A or >= B. If C <» A, then C <= A <= (A + B)/2 and 
C »« (A + B)/2, so A **= (A + B)/2. Simplifying, we get A == B. 

The proof is similar in the case where B <= C. This completes the 
proof of the fourth VC. 


How can we be sure this program terminates? Suppose it didn't 
terminate. Then C is always strictly between A and B when 
control reaches the top of the loop. At each iteration, either A 
is set to C, in which case the value of A increases, or B is set 
to C, in which case the value of B decreases. If the loop never 
terminates, then either A must increase infinitely often or B 
must decrease infinitely often (notice that "infinite" here does 
not mean hyperfinite, but actually hyperinfinite). If A 
increases infinitely often, then we can define an infinite 
ascending sequence of machine reals, which contradicts the first 
cropping function axiom. If B decreases infinitely often, then 
we can define an infinite descending sequence of machine reals, 
again a contradiction. Thus the program must terminate. 


Notice that this way of ensuring termination is unlike the
method usually used for programs of this type. Usually the
program terminates when A and B come within a certain fixed (or
sometimes user-supplied) distance of each other. When such a
fixed distance is used, we cannot expect the results of the
program to be closer than that distance to the actual zero. In
the above program, however, the program terminates only when the
distance between A and B is very small compared to the precision
of the machine's arithmetic. In fact, it only terminates when
further iterations would move C further away from the zero. Not
only does the above program tend to use all of the precision
available on a given machine, the same program run on more and
more precise machines will give more and more precise answers.
Thus, the asymptotic paradigm is not only a way of analyzing
programs, it is also useful for designing programs.
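
As a rough illustration of this design, the loop can be transcribed
into Python. This is a sketch under assumptions, not the program that
was verified: Python doubles stand in for the machine reals, the
function x*x - 2 is a hypothetical choice of g, and the step counter is
added only for observation.

```python
def opp(x, y):
    """OPP(x, y): x and y are nonzero and of opposite sign."""
    return (x < 0 < y) or (y < 0 < x)

def bisect(g, a0, b0):
    """Bisect until g(c) is 0 or the machine average leaves (a, b)."""
    a, b = a0, b0
    c = (a + b) / 2
    steps = 0
    while g(c) != 0 and a < c < b:
        if opp(g(a), g(c)):
            b = c          # the zero is bracketed by [a, c]
        else:
            a = c          # the zero is bracketed by [c, b]
        c = (a + b) / 2
        steps += 1
    return a, b, c, steps

a, b, c, steps = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
print(a, b, steps)
```

No tolerance parameter appears anywhere; the loop stops only when the
machine's own rounding prevents further progress.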


Published versions of the above algorithm actually contain an 
error! Consider again our program 


    { PRE & AO <= BO }
    A := AO;
    B := BO;
    C := (A ++ B)//2;
    DO WHILE(g(C) <> 0 & A < C < B);
        { AO <= A <= BO & AO <= B <= BO
          & OPP(g(A),g(B)) & C = (A ++ B)//2 }
        IF OPP(g(A),g(C))
            THEN B := C;
            ELSE A := C;
        C := (A ++ B)//2;
        END;
    END;
    {POST}


and the form similar to how it appears in IMSL

    A := AO;
    B := BO;
    C := (A ++ B)//2;
    DO WHILE(g(C) <> 0 & A < C < B);
        IF g(A) ** g(C) < 0
            THEN B := C;
            ELSE A := C;
        C := (A ++ B)//2;
        END;
    END;

The program is incorrect: while g(A) * g(C) may be < 0, the
machine product may round up to 0, so that g(A) ** g(B) < 0 should
not be used in place of OPP(g(A), g(B)). Of course, this program
will give a correct answer for many inputs so that testing might
not uncover the error. To show the power of our method let us
consider whether {PRE} P {POST} is a true Hoare sentence in
non-standard universes where P is the above program and we use
the same pre- and postconditions as before, namely

    PRE:  fin(AO) & fin(BO) & OPP(g(AO), g(BO))

    POST: fin(A) & fin(B) & fin(C)
          & [ g(C) = 0 or (OPP(g(A),g(B)) & A == B) ]

where OPP(x, y) is

    x < 0 < y or y < 0 < x.

Let f(x) = x and g(x) = CR(x) be the function which approximates
it. Let AO and BO be non-infinitesimal, finite negative and
positive numbers respectively with g(AO) = AO and g(BO) = BO such
that (AO ++ BO)//2 = CO is a positive infinitesimal (note: g(CO)
= CO) with AO ** CO = 0. It is possible to get explicit examples
of this by choosing the CRi, ei, m+i, m-i, etc. appropriately.
Then after the first iteration of the loop we have that A = g(A),
B = g(B), and C = g(C) are all positive. Furthermore after each
bisection the right hand half is chosen since A ** C >= 0.
Furthermore since BO was non-infinitesimal we always have A < C <
B after any standard number of steps, so that the loop never
terminates in a standard number of steps. By the same argument we
gave for the original program the loop does terminate in a
non-standard number of steps. On termination g(A) and g(B) are
both positive and g(C) is nonzero, so POST fails. The Hoare
sentence is then false.
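
The hazard in the product test has a standard-universe analogue that
can be reproduced directly. The following sketch assumes IEEE double
precision (Python floats); the two values are hypothetical stand-ins
for g(A) and g(C), chosen so that their exact product is too small to
be represented.

```python
def opp(x, y):
    """OPP(x, y): x and y are nonzero and of opposite sign."""
    return (x < 0 < y) or (y < 0 < x)

ga = -1e-200    # hypothetical value of g(A)
gc = 1e-200     # hypothetical value of g(C)

product = ga * gc              # exact value -1e-400 underflows to zero
product_test = product < 0     # the test used in the published form
sign_test = opp(ga, gc)        # the test used in our program

print(product, product_test, sign_test)
```

The two tests disagree on this input: the signs are opposite, yet the
machine product is not negative, so the published form would discard
the half-interval that contains the zero.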
Chapter 4


Applying the Asymptotic Paradigm: The Programming Logic Approach

The Verification Condition approach has been used since the
late 1960s to prove programs. It has generated much criticism
since verification environments based on this approach generally
lead to low productivity. One of the identified problems is the
use of two languages: the programming language and the
assertional, mathematical language. When an unprovable VC in the
mathematical language is uncovered, the corresponding error in the
asserted program must be found. This error is either an error in
the logic of the program or an inappropriate embedded assertion.
It is sometimes difficult to discern which of these alternatives
is the case. If the error is in the program's logic, the place
where that error occurs may not correlate simply to the place
where the false VC was generated. When the error is corrected the
new asserted program is resubmitted to the VCG and the regenerated
VCs must be proved. Even a slight change in the program might
change the form of several of the formerly provable VCs, and these
must all be reproved.

Several alternative approaches to Verification Condition
Generation have been proposed. These attempt to narrow the gap
between program and proof in order to avoid the above loop. In
this chapter we describe some very tentative work we performed in
trying to adapt one of these newer approaches, the programming
logic approach, to the asymptotic paradigm.

4.1 The Programming Logic Idea 

The underlying philosophy of the programming logic approach is
that reasoning and correct programming are the same process.
Traditionally these two activities have had their separate
languages: reasoning has been done in classical first-order
logic, and descriptions of algorithms in the plethora of
programming languages. The ultimate goal of the programming
logic approach is to find a single formal language which
facilitates both logical reasoning and algorithmic description as
a single activity. A programming logic is a single language to
meet both demands.

Prior to the twentieth century there was less of a distinction
between programming and proving since only constructive methods
were allowable in proofs. The distinction between constructive
and non-constructive proofs is that in the former when one claims
that some object exists one must actually exhibit it, whereas in
the latter existence can be shown through indirect means such as
reductio ad absurdum. Let us illustrate this point using the
following non-constructive proof.

Theorem: There are two irrational numbers a and b such that a
raised to the power of b is rational.

Proof: Consider SQRT(2). We know that SQRT(2) is irrational.
Now let x = SQRT(2)^SQRT(2). Is x rational? If it is, then the
theorem is proved by letting a = b = SQRT(2). Otherwise, consider
x^SQRT(2) = SQRT(2)^(SQRT(2)*SQRT(2)) = SQRT(2)^2 = 2. That is
certainly rational, so in this case the theorem is proved by
letting a = x and b = SQRT(2). This finishes the proof. In either
case we have found a and b satisfying the theorem. QED.

Using contemporary standards of correctness this is a valid
proof. But the naive student usually says: Where's the Beef?
Where are the a and b that you promised me? In fact, the really
hard question is which of the two alternatives in the proof is
true (it's the second; a very deep theorem in number theory shows
that a^b is transcendental when a and b are algebraic, a is
neither 0 nor 1, and b is not rational).


The above proof is not acceptable from a constructive point of
view; indeed it cannot even be made in a formal constructive
logic. In such constructive logics one can extract from a proof
of

    all x (some y R(x, y))

a function f given by a term in the language such that

    all x R(x, fx)

is also provable. Constructive logics are actually rather close
to programming languages. The programming logic approach to
verification is to formulate a constructive logic whose terms can
be evaluated by an interpreter. The scenario is then the
following: If one is required to program a function f with the
specification

    all x R(x, fx)

one proves the mathematical theorem

    all x (some y R(x, y))

in the constructive logic. Since the logic is constructive the
proof checker will extract automatically from the proof an
algorithm for computing y from x and the interpreter will
calculate this algorithm on any input supplied. We thus have a
program and a proof of correctness in the same text. The
algorithm extracted from the proof can be compiled and stored in
a library for future use. On the other hand, the extracted
algorithm may not be intelligible to humans. Programming logic
enthusiasts hold the tenet that the proof from which the
algorithm was extracted is the print form of the algorithm. To
ask to look at anything else is like asking to look at binary
code. Thus we see that in this approach programming is proving.

ML (which stands for Meta Language) is a programming language
designed specifically for the purpose of developing formal
systems and formal logics. It is described in the next section.
ML will be the language in which we develop the ideas of a
programming logic. This is done for three reasons. First, ML is
designed for the very activity of developing logics. Second,
fixing a particular programming language allows the discussion to
present concrete examples. Finally, ML is a language which has
more than a few similarities with a programming logic.
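
The proof-to-program scenario of this section can be made concrete
with a small hand-worked example. The sketch below is written in
Python rather than in a programming logic, and nothing in it is
extracted mechanically; the relation R and the names spec and
isqrt_witness are our own illustrative choices. The specification is
"all x (some y (y*y <= x < (y+1)*(y+1)))", and the function follows an
induction on x: 0 works for x = 0, and if y works for x then either y
or y + 1 works for x + 1.

```python
def spec(x, y):
    """R(x, y): y is the integer square root of x."""
    return y * y <= x < (y + 1) * (y + 1)

def isqrt_witness(x):
    """Witness for 'some y R(x, y)', following the induction on x."""
    y = 0                              # base case: R(0, 0) holds
    for n in range(1, x + 1):          # induction step from n - 1 to n
        if (y + 1) * (y + 1) <= n:
            y = y + 1                  # the witness moves up by one
        # otherwise the old witness still satisfies R(n, y)
    return y

print([isqrt_witness(x) for x in range(10)])
```

Each branch of the loop body corresponds to one case of the inductive
step, which is the sense in which the proof and the program are the
same text.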

After the introduction to ML, the programming logic approach
will be presented in four stages:

1. A simple fragment of constructive propositional logic is
   developed in ML. This is primarily to illustrate how logics
   are represented in ML.

2. A programming logic for integer arithmetic is described in
   ML.

3. An interpreter for this logic is developed which defines
   its semantics as a programming language.

4. Preliminary work towards incorporating the asymptotic
   paradigm into the programming logic approach is presented.

4.2 The Programming Language ML

The programming language ML was originally designed by Robin
Milner for use as the meta-language in the LCF verification
system. The book [1] contains a detailed presentation of one
variant of ML. Besides its use in LCF the language is interesting
in its own right and is versatile enough to compete with other
non-imperative languages like LISP and PROLOG. Although similar
to LISP, it contains features lacking in LISP which are felt
by many to be important in a modern programming language; for
example, it is strongly typed and has a readable syntax. Since
it is primarily a research tool, ML has not been standardized.
Here we follow the syntax of the UNIX version of ML written by
Luca Cardelli at Bell Laboratories.

The distinguishing characteristics of ML are

- interactive dialog;
- strong type system;
- functional style;
- exception-trap mechanism;
- abstract data type defining mechanism;
- separately compiled modules.

[1] Gordon, Milner and Wadsworth, Edinburgh LCF, Lecture Notes in
Computer Science 78 (Springer-Verlag: Berlin, 1979).

Like LISP, ML is an interactive programming language. An ML
session consists of a dialog between the user and the system.
The user enters expressions terminated by a semicolon. Typing a
carriage return sends the line to the interpreter which keeps
accepting input lines until a complete expression is found. The
value of the expression and its type are returned by the ML
system. ML prompts the user to input an expression with "-", and
responds on the following line which begins with ">". Here are
some examples:

- (3 + 5) * 2;
> 16 : int
- "this is a string";
> "this is a string" : string
- 3,4,5;
> 3,4,5 : int * int * int
- if 1=2 then 3 else 4;
> 4 : int
- [3,true;5,false]; hd [1;2;3;4];
> [(3,true);(5,false)] : (int * bool) list
  1 : int

These examples give some idea how integer, boolean and string
constants are used in simple expressions. The last two
expressions were typed in to the ML system on the same line.
Notice that elements of a list are of the same type, are
separated by semicolons, and are enclosed in square brackets.
The empty list is "[]". If "t" is the type of the elements then
"t list" is the type of the list. What then is the type of "[]"?
It is " 'a list" where 'a is a type variable. This illustrates
ML's polymorphism, discussed in more detail later. Another
example of a polymorphic object is "hd", used above for the head
of a list. Its type is " 'a list -> 'a".

Elements of tuples like "1, 2, true" are separated by commas.
They needn't be enclosed in parentheses. The type of a tuple is
the cartesian product of the types of the elements. The
cartesian product type operator is "*" in ML so that the former
tuple has type int * int * bool. "(1, 2), true" would have type
(int * int) * bool which is different.


Next we illustrate how variables are bound to values. At the
top level this is done using the keyword "val", but there are
various ways of making local bindings as well.


- val a = 3;                      { Bind the value 3 to a }
> val a = 3 : int
- val a = 4 and b = 27,true;      { Make two bindings }
> val a = 4 : int
| val b = 27,true : int * bool
- val a,b = 4, (27, true);
> val a = 4 : int
| val b = 27,true : int * bool
- a + 6 where val a = 5 end; a;   { N.B. there will be no
                                    global change in a }
> 11 : int
  4 : int
- let val a = 5 in a+6 end;
> 11 : int


The "let" and "where" constructs accomplish the same purpose:
abbreviating a local value. Such a local binding has no
effect on the global value of the variable. Notice that comments
are enclosed in curly braces.


ML is a functional programming language. Functions in ML are
created using the keyword "fun". (LISP uses "LAMBDA" for the same
purpose.) Functions are first class objects; they can be
arguments to other functions and can be returned as values of
functions. Functions can be bound to variables at the top level
just like any other value: "val f = (fun x. b)". Here b is some
expression usually containing x. This way of defining the function
f is given an alternate syntax in keeping with the customary way
of writing definitions: "val f(x) = b". This is completely
equivalent; it only binds f, using x as a means of expressing the
return value as a function of the input.

- val f (n) = n + 1; f(2);
> val f : int -> int
  3 : int
- val g (x,y) = (x+y) div 2; g(f(7),2+2);
> val g : (int * int) -> int
  6 : int
- val rec fact (n) = if n=0 then 1 else n*fact(n-1);
> val fact : int -> int
- val f (n: int) = g (fun x.n) where val g (h) (n) = h(n) end;
- f(2)(3);
> 2:int
- val g (x, y) = f(x)(y);
> val g :int * int -> int

- g(2, 3);
> 2:int

The alert reader will notice that the ML interpreter not only
evaluates expressions but assigns types as well. It does this
using an internal pattern matching algorithm. Consider the
definition of fact given above. ML decides on the basis of the
right hand expression that fact is of type int -> int (note that
the variable n is not declared to be of type int in this
definition of fact). The user can give type restrictions as in
the binding of f where n is declared to be of type int. Note
that ML will figure out that g in the local binding on this line
is of type (int -> int) -> (int -> int). If one had written

- val f (n) = g (fun x.n) where val g (h) (n) = h(n) end;

then ML would return

> val f : 'a -> ('b -> 'a)

where 'a, 'b are type variables. Such type polymorphism is a
unique feature of ML. Where the type of an argument does not
matter, it need not be specified. Hence one need not define one
identity function exclusively for integers and another for
strings. One identity function will do, one for any type.

- val f (x) = x;
> val f : 'a -> 'a
- val swap (x,y) = (y,x);
> val swap : ('a * 'b) -> ('b * 'a)
- val comp (f,g) (x) = f (g (x));
> val comp : (('a -> 'b) * ('c -> 'a)) -> ('c -> 'b)

Type variables always begin with a single quote in ML.

Expressions in ML can raise exceptions and then trap them.
This is similar to the catch and throw mechanism in LISP. A
function, instead of returning a value, may signal some abnormal
condition. For example, the built-in division function signals
division by zero whenever the divisor is zero. Further execution
halts and ML prints a message to the user at the top level.

- 1 div 0;
> Exception: div
- val f (n) = if n<0 then escape "Neg arg" else fact (n);
> val f : int -> int
- f (3); f (-3);
> 6 : int
  Exception: "Neg arg"

Exceptions can be trapped before they reach the top level. This
permits the computation of an alternate value for an expression
should it raise an exception. The syntax of the trapping
mechanism calls for a question mark after the expression that may
signal an exception and before the alternate expression to be
evaluated in the event an exception is signaled.

- (1 div 0) ? 45;
> 45 : int
> val g : int -> int       { this uses definition of f above }
- g(3); g(-3);
> 6 : int
  6 : int
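
For readers more familiar with other languages, the escape and trap
mechanism corresponds roughly to raising and catching exceptions. The
following Python sketch is an analogy only, not ML: the helper trap is
a hypothetical stand-in for the "?" operator, and ValueError is an
arbitrary choice of exception to carry the string "Neg arg".

```python
def fact(n):
    return 1 if n == 0 else n * fact(n - 1)

def f(n):
    if n < 0:
        raise ValueError("Neg arg")    # plays the role of: escape "Neg arg"
    return fact(n)

def trap(expr, alternate):
    """Evaluate expr(); if it raises, evaluate alternate() instead.
    Mirrors the ML form  e1 ? e2  with both sides passed as thunks."""
    try:
        return expr()
    except Exception:
        return alternate()

print(trap(lambda: 1 // 0, lambda: 45))
print(trap(lambda: f(-3), lambda: f(3)))
```

The alternate expression is evaluated only when the first one fails,
which is why both sides are wrapped as functions here.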

ML has the capability to define new types. This can be done in
two ways. The functions that construct the elements of a new
type can be specified. This is a concrete type. An abstract
type is defined by giving the constructors as well, but then all
the functions that will ever make use of the constructors must be
defined on the spot and explicitly exported out of the type
definition. The constructors are not available outside the scope
of the type definition. This ensures controlled access to the
type. We first give examples of concrete types.

- type Color = Red | Blue | Yellow;
> con Red : Color
  con Blue : Color
  con Yellow : Color
- type rec Tree = Leaf of int | Node of int * Tree * Tree;
> con Leaf : int -> Tree
| con Node : (int * Tree * Tree) -> Tree

Tree is an example of a concrete recursively defined type. This
concept encapsulates the use of pointers which are not otherwise
available to the user. The functions "Leaf" and "Node" are
constructors of type "Tree", since through them elements of type
"Tree" are created. The tree consisting of a single leaf is
created by the function "Leaf". This is the only way to create a
tree without having already made other trees. "Red", "Blue" and
"Yellow" are constructors as well; they require no arguments.
The type "Color" would be called an enumerated type in Pascal.


An important part of the ML language is pattern matching. This
is often used in conjunction with the case statement to break
apart the structure of a type. An element of a type is taken
apart in the case statement. It is matched against the pattern
consisting of variables and constructors in each branch of the
case statement. This is how destructuring is accomplished and
explains the lack of "destructors" or "selectors" in the
language. We give several examples of this using the definition
of the type "Tree" above.

- val f (t: Tree): int =               { A silly example. }
    case t of
      Leaf n . n |                      { A leaf }
      Node (n, Leaf _, Leaf _) . n |    { A tree with two leaves }
      Node (n, Node _, Leaf _) . n |    { A skewed tree }
      Node (n, _, _) . n;               { Everything else }
> f : Tree -> int

- val rec EqTree (t: Tree, s: Tree): bool =
    case (t,s) of
      (Leaf n, Leaf m) . n=m |
      (Node (n,t1,t2), Node (m,s1,s2)) .
          if n=m
          then if EqTree (t1,s1) then EqTree (t2,s2) else false
          else false |
      (_,_) . false;
> val EqTree : Tree * Tree -> bool

The principal features to notice about pattern matching are that
variables are bound and that "_" is the wildcard pattern matching
anything.
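
The same case analysis can be sketched in a language without ML's
patterns. The Python below is an analogy, not a translation the
authors give: the classes Leaf and Node and the function eq_tree are
our own names, and each isinstance test stands in for one branch of
the case statement.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leaf:
    label: int

@dataclass(frozen=True)
class Node:
    label: int
    left: object     # a Leaf or a Node
    right: object    # a Leaf or a Node

def eq_tree(t, s):
    """Structural equality, written branch by branch like EqTree."""
    if isinstance(t, Leaf) and isinstance(s, Leaf):
        return t.label == s.label
    if isinstance(t, Node) and isinstance(s, Node):
        return (t.label == s.label
                and eq_tree(t.left, s.left)
                and eq_tree(t.right, s.right))
    return False     # corresponds to the wildcard branch (_,_)

t = Node(1, Leaf(2), Leaf(3))
print(eq_tree(t, Node(1, Leaf(2), Leaf(3))), eq_tree(t, Leaf(1)))
```

Note that the left subtrees are compared with each other and the
right subtrees with each other; mixing them up would make unequal
trees compare equal.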


The next example is an abstract type definition of a tree. The
constructors "Leaf" and "Node" will not be available outside the
scope of the abstract type definition.

- abstype rec Tree = Leaf of int | Node of int * Tree * Tree
  with val MakeLeaf (n) = Leaf n;
       val MakeNode (n,t1,t2) = Node (n,t1,t2);
       val Label (t) =
           case t of
             Leaf n . n |
             Node (n,t1,t2) . n;
       val RightSubTree (t) =
           case t of
             Leaf n . escape "Leaf" |
             Node (n,t1,t2) . t2
  end;
> abstype Tree
  val MakeLeaf : int -> Tree
  val MakeNode : (int * Tree * Tree) -> Tree
  val Label : Tree -> int
  val RightSubTree : Tree -> Tree

The scope of the abstract type definition extends from the ML
keyword "with" to the closing keyword "end". Within this scope
the constructors "Leaf" and "Node" are available and have been
used to define other functions to make elements of type "Tree"
and, with the help of the case statement, to take apart elements
of type "Tree".

Defining a type abstractly and exporting only certain functions
of the constructors is useful when one is interested in certain
subtypes. For example consider balanced trees:


- abstype rec BalTree = Empty | Leaf of int |
                        Node of int * BalTree * BalTree
  with
    val Null = Empty;
    val MakeLeaf = Leaf;
    val MakeNode (n, t1, t2) =
        if height t1 = height t2
        then Node (n, t1, t2)
        else escape "Not Balanced"
        where val rec height (t) =
            case t of
              Empty . 0 |
              Leaf (_) . 1 |
              Node (_, t1, t2) . 1 + max (height(t1), height(t2))
        end;
  end;


The Null, MakeLeaf, and MakeNode functions which are exported
from this abstract type definition will only permit the user to
construct balanced trees. Note that MakeNode would allow the user
to make an unbalanced tree from t1 and t2 if either one of these
were already unbalanced (since only their heights are checked, not
their individual symmetry), but that means the user could make an
unbalanced tree only if he already had an unbalanced tree. Since
Null and MakeLeaf give only balanced trees, we see that it is true
that the user can make only balanced trees.
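
The idea of exporting only checked constructors can be imitated, less
strictly, in other languages. The Python sketch below is an analogy
under assumptions: Python cannot hide the class constructor the way an
ML abstype hides Empty, Leaf and Node, so the discipline rests on using
only NULL, make_leaf and make_node. All names here are ours, and the
height is cached in each tree rather than recomputed.

```python
class BalTree:
    """Balanced trees; by convention built only via the functions below."""
    def __init__(self, label, left, right, height):
        self.label = label
        self.left = left
        self.right = right
        self.height = height

NULL = BalTree(None, None, None, 0)          # the empty tree

def make_leaf(n):
    return BalTree(n, None, None, 1)

def make_node(n, t1, t2):
    """Join two trees under a new root, refusing unequal heights."""
    if t1.height != t2.height:
        raise ValueError("Not Balanced")
    return BalTree(n, t1, t2, 1 + max(t1.height, t2.height))

tree = make_node(1, make_leaf(2), make_leaf(3))
print(tree.height)
```

Because every tree reachable through these three functions was itself
built from balanced parts, the induction in the text applies unchanged.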


4.3 Representing a Logic in ML

The following ML program implements a portion of propositional
logic. We assume the type Proposition has already been defined
in ML. One way to do this is to introduce Proposition as an ML
type using a constructor which turns identifiers into
propositions:

- type Proposition = PropCon of string;
> type Proposition = PropCon of string
| con PropCon : string -> Proposition

We first define what the formulas shall be. Formulas are a
data type in ML, and their definition follows the usual one in
mathematical logic: Every proposition is an atomic formula, and a
pair of formulas can be made into another formula using the
implication connective. Of course, we might be interested in
other connectives or at least in a formula to represent
falsehood, but implication shall suffice for this example
(although the resulting logic is not complete).

- type rec Formula =
      AtomicFormula of Proposition |
      Imply of Formula * Formula;
{ A Proposition is a Formula; out of 2 formulas, make implication. }
> type Formula = ...
| con AtomicFormula : Proposition -> Formula
| con Imply : (Formula * Formula) -> Formula

Next we need some functions to manipulate formulas: to extract
the hypothesis and conclusion subformulas from an implication and
a function to test for syntactic equality.


- val Hypothesis (f) =           { Get hypothesis of implication. }
    case f of
      AtomicFormula _ . escape "Not implication" |
      Imply (f1,f2) . f1;
> Hypothesis : Formula -> Formula

- val Conclusion (f) =           { Get conclusion of implication. }
    case f of
      AtomicFormula _ . escape "Not implication" |
      Imply (f1,f2) . f2;
> Conclusion : Formula -> Formula

- val rec EqFormula (f1,f2) =    { Syntactic equality of formulas. }
    case (f1,f2) of
      (AtomicFormula a, AtomicFormula b) . a=b |
      (Imply (h1,c1), Imply (h2,c2)) .
          if EqFormula (h1,h2) then EqFormula (c1,c2) else false |
      (_,_) . false;
> EqFormula : (Formula * Formula) -> bool

Finally we define the calculus of propositional logic by
defining an abstract type representing proofs. An element of
type theorem can be constructed only as an instance of one of the
two axioms or as a result of modus ponens applied to theorems.
The two axiom schemes (written using conventional notation) are

    K: p -> (q -> p)

    S: (p -> (q -> r)) -> ((p -> q) -> (p -> r))

where p, q, and r are any formulas. Modus ponens yields q when
applied to p -> q and p.

In order to actually give the ML definition of the abstract
data type representing proofs, a useful auxiliary function must
be defined along the way. This function "ProofOf" takes an
element of type theorem and returns the formula of which the
theorem is a proof.

- abstype rec Thm =
      AK of Formula * Formula |            { K axiom is represented. }
      AS of Formula * Formula * Formula |  { S axiom is represented. }
      MP of Thm * Thm                      { modus ponens is represented. }
  with
    val AxiomK (p,q) = AK (p,q);
    val AxiomS (p,q,r) = AS (p,q,r);
    val rec ProofOf thm =
        case thm of
          AK (p,q) . Imply (p, Imply (q,p)) |
          AS (p,q,r) . Imply (Imply (p, Imply (q,r)),
                              Imply (Imply (p,q), Imply (p,r))) |
          MP (t1,t2) . Conclusion (ProofOf t1);
    val ModusPonens (t1,t2) =
        if EqFormula (Hypothesis (ProofOf t1), ProofOf t2)
        then MP (t1,t2) else escape "Fail"
  end;
> abstype Thm
  val AxiomK : (Formula * Formula) -> Thm
  val AxiomS : (Formula * Formula * Formula) -> Thm
  val ProofOf : Thm -> Formula
  val ModusPonens : (Thm * Thm) -> Thm


This completes a formalization of a fragment of propositional
logic. An element of the ML data type "Formula" represents a
formula in propositional logic. An element of "Thm" actually
represents a proof, a bona fide proof. By virtue of the fact that
ML certifies an element is of type "Thm", the element
represents a proof in the propositional calculus, because the
only way such an element can be produced is to use one of the
three constructors. Each constructor represents a valid proof
step in the propositional logic. An element of type "Thm" can
be created no other way. A study of the above example will show
that one is relying on the ML type encapsulating mechanism with
its control of exported constructors to simplify the construction
of the logic. For example, the only objects prex of type Thm
which the user can construct really are proofs of ProofOf (prex)
because of the exception raised when ModusPonens is applied to
arguments not of the appropriate form.

Using the data type definition given in the previous paragraph,
we can now prove that a proposition P implies itself. We can be
sure that it is a theorem of the propositional calculus because
it is an element of the ML type "Thm".

The usual way P implies P is proved from these axioms is by the
following informal proof

1. [p -> [(p -> p) -> p]] -> [[p -> (p -> p)] -> (p -> p)]  Axiom S
2. p -> [(p -> p) -> p]                                     Axiom K
3. [p -> (p -> p)] -> (p -> p)                Modus ponens 1. and 2.
4. p -> (p -> p)                                            Axiom K
5. p -> p                                     Modus ponens 3. and 4.

Note that the last step needs the instance p -> (p -> p) of Axiom
K as its minor premise; line 2 is a different instance and does
not match the hypothesis of line 3.

Formally we make the following bindings in ML.

- val prex1 = AxiomK (P, Imply(P,P));      { P->[(P->P)->P] }
> - : Thm
- val prex2 = AxiomS (P, Imply(P,P), P);
  { [P->[(P->P)->P]] -> [[P->(P->P)] -> (P->P)] }
> - : Thm
- val prex3 = ModusPonens (prex2, prex1);  { [P->(P->P)]->(P->P) }
> - : Thm
- val prex0 = AxiomK (P, P);               { P->(P->P) }
> - : Thm
- val prex4 = ModusPonens (prex3, prex0);  { P->P }
> - : Thm


Now typing ProofOf (prex4) to the interpreter will yield the ML
representation of the formula P->P as a response. Notice that
this proof is for a particular element of ML type Formula
(which we have been denoting P). For this element we could use
the ML object AtomicFormula (PropCon ("P")), or an implication
built from such atomic formulas with Imply, or any one of a
number of others. The proof works for any element of type
Formula. Furthermore, if we replace all occurrences of that
particular formula P by an ML variable, say x, of type Formula in
the proof, we get a parameterized proof, call it prex4'. Then
I = (fun x. prex4') is a function which maps any Formula into
a proof that it implies itself. The function I is a derived rule
of inference. This illustrates how ML acts as a useful
meta-language.
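
The construction can be imitated in Python to check the proof of
P -> P step by step. This is a sketch under stated assumptions and
differs from the ML text in two ways: Python has no abstract types, so
a private key object merely discourages building theorems outside the
three rules, and each theorem stores its proved formula directly rather
than a proof tree from which ProofOf recomputes it. All names (Atom,
Thm, axiom_k, axiom_s, modus_ponens, derived_identity) are ours.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Imply:
    hyp: object
    concl: object

class Thm:
    """A theorem; the key check stands in for ML's hidden constructors."""
    _KEY = object()

    def __init__(self, key, formula):
        if key is not Thm._KEY:
            raise TypeError("theorems come only from the inference rules")
        self.formula = formula

def axiom_k(p, q):
    return Thm(Thm._KEY, Imply(p, Imply(q, p)))

def axiom_s(p, q, r):
    return Thm(Thm._KEY, Imply(Imply(p, Imply(q, r)),
                               Imply(Imply(p, q), Imply(p, r))))

def modus_ponens(t1, t2):
    f = t1.formula
    if not isinstance(f, Imply) or f.hyp != t2.formula:
        raise ValueError("Fail")
    return Thm(Thm._KEY, f.concl)

def derived_identity(x):
    """Derived rule: for any formula x, a theorem proving x -> x."""
    s = axiom_s(x, Imply(x, x), x)
    k1 = axiom_k(x, Imply(x, x))
    k2 = axiom_k(x, x)
    return modus_ponens(modus_ponens(s, k1), k2)

P = Atom("P")
print(derived_identity(P).formula)
```

The function derived_identity plays the role of the derived rule I:
it works for compound formulas as well as atomic ones.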


4.4 The Programming Logic for Arithmetic


Now we describe a formal system for reasoning and programming
over the integers. Our formalism will include the reals but we
will give no axioms for this sort in this section. We describe
its syntax in a manner similar to what was done above for the
fragment of the propositional logic. Its rules are taken
directly from constructive predicate calculus and Peano
arithmetic together with a finite type hierarchy.

First the sorts are defined as a recursive data type, Sort. NN,
RR and Prop are basic sorts and Arrow(s1, s2), Cross(s1, s2) are
sorts when s1 and s2 are. Arrow(s1, s2) is the sort of functions
from sort s1 to sort s2. Thus our former sort NNSeq is Arrow(NN,
NN) and RRSeq is Arrow(NN, RR). We could have used a full simple
type theory in our previous discussion but we didn't find it
necessary. It is simple to do it here since ML's recursive type
constructor facility strongly suggests it. Cross(s1, s2) is the
Cartesian product of sorts s1 and s2.

- type rec Sort = NN | RR | Prop |
                  Arrow of Sort * Sort |
                  Cross of Sort * Sort;

Terms are defined os an abstract recursive type. The context 
sensitive part of the definition which corresponds to sort 
checking needn't be considered when declaring the basic type 
constructors; this semantic information is captured by having the 
exported functions raise an exception when their inputs are not 
of the right sorts. This kind of failure is detected by using 
the SortOf function which is defined recursively over the terms. 

Informally, variables of a fixed sort are terms; if t1 is a
term of sort Arrow(s1, s2) and t2 is a term of sort s1 then
Application(t1, t2) is a term of sort s2 (the result of
applying the function t1 to its argument t2); if t is a term of
sort s1 and x is a variable of sort s2 then Abstraction(x, t) is
a term of sort Arrow(s2, s1) (the lambda abstraction which
yields a function which assigns t[t'/x] to objects t' of sort s2,
where t[t'/x] is the result of replacing x in t by t'); if t1 and
t2 are terms of sorts s1, s2 then Pair(t1, t2) is a term of sort
Cross(s1, s2); if t is a term of sort Cross(s1, s2) then First(t)
is a term of sort s1 and Second(t) is a term of sort s2; Zero is
a term of sort NN; SuccFunc is a term of sort Arrow(NN, NN); FF
is a term of sort Prop; Imply, Or, and And are terms of sort
Arrow(Cross(Prop, Prop), Prop); if x is a variable and t is a
term of sort Prop then Some(x, t) and All(x, t) are terms of sort
Prop; if t1 and t2 are two terms of the same sort then Eq(t1, t2)
is a term of sort Prop; if t1 is a term of sort Prop and t2, t3
are terms of the same sort s then If(t1, t2, t3) is a term of
sort s; if t1 is a term of sort s and t2 is a term of sort
Arrow(Cross(NN, s), s) then Rec(t1, t2) is a term of sort
Arrow(NN, s). We leave out all constants, functions and
relations over RR. They will play no role in this section; when
they do come in later sections, they will be treated informally.
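The sort discipline just described can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions, not the report's ML implementation: terms are represented as tagged tuples, the names sort_of, SortError and the tuple layouts are hypothetical, and only some of the term constructors are covered. It shows how a recursively defined SortOf function can reject ill-sorted terms by raising an exception.

```python
# Sorts: basic sorts are strings, compound sorts are tagged tuples.
NN, RR, PROP = "NN", "RR", "Prop"

def Arrow(s1, s2):
    return ("Arrow", s1, s2)

def Cross(s1, s2):
    return ("Cross", s1, s2)

class SortError(Exception):
    """Raised when a term is not well sorted."""

def _is(sort, tag):
    return isinstance(sort, tuple) and sort[0] == tag

def sort_of(t):
    """Compute the sort of a term given as a tagged tuple (assumed layout)."""
    tag = t[0]
    if tag == "Var":                       # ("Var", name, sort)
        return t[2]
    if tag == "Zero":
        return NN
    if tag == "SuccFunc":
        return Arrow(NN, NN)
    if tag == "FF":
        return PROP
    if tag == "Pair":                      # ("Pair", t1, t2)
        return Cross(sort_of(t[1]), sort_of(t[2]))
    if tag in ("First", "Second"):         # ("First", t) or ("Second", t)
        s = sort_of(t[1])
        if not _is(s, "Cross"):
            raise SortError("projection needs a term of a Cross sort")
        return s[1] if tag == "First" else s[2]
    if tag == "Application":               # ("Application", t1, t2)
        f, a = sort_of(t[1]), sort_of(t[2])
        if not _is(f, "Arrow") or f[1] != a:
            raise SortError("argument sort does not match function sort")
        return f[2]
    if tag == "Abstraction":               # ("Abstraction", var, body)
        return Arrow(sort_of(t[1]), sort_of(t[2]))
    if tag == "Eq":                        # ("Eq", t1, t2)
        if sort_of(t[1]) != sort_of(t[2]):
            raise SortError("Eq needs two terms of the same sort")
        return PROP
    raise SortError("unknown term constructor: " + str(tag))
```

For example, applying SuccFunc to a variable of sort NN yields sort NN, while applying it to FF raises SortError, which mirrors the behaviour of the exported ML functions described above.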

Using these basics we can introduce definitions using ML.

- val Succ (x) = if SortOf(x)=NN then
                     Application(SuccFunc, x)
                 else escape "Not Natural Number";

- val One = Succ(Zero);

- val Imp (x, y) = if SortOf(x)=Prop and SortOf(y)=Prop then
                       Application(Imply, Pair(x, y))
                   else escape "Not Props";

- val TT = Imp (FF, FF);

- val Neg (x : Term) = Imp (x, FF);

The recursion operator, Rec, is used to introduce functions by
recursion. If t1 is of sort s1 and t2 is of sort Arrow(Cross(NN,
s1), s1) then Rec(t1, t2) is the function g of sort Arrow(NN,
s1) given by

    g(0) = t1

    g(n + 1) = t2(n, g(n)).

Suppose one wants to define plus using the Rec recursion
operator. The basic equations are

    Plus(a, 0) = a

    Plus(a, n + 1) = Plus(a, n) + 1.

Suppose x and y are variables of sort Cross(NN, NN). Then

    Abstraction(x, Application(Rec(First(x), Abstraction(y,
        Succ(Second(y)))), Second(x)))

is a term of sort Arrow(Cross(NN, NN), NN) which defines the term
Plus. Here Rec(First(x), ...) has sort Arrow(NN, NN) and is
applied to Second(x).
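The recursion equations for Rec and Plus can be checked on ordinary integers. The sketch below is an illustration only: Python integers stand in for terms of sort NN, Python tuples stand in for Pair, and the names rec and plus are hypothetical. It builds the function g with g(0) = t1 and g(n + 1) = t2(n, g(n)) and then obtains addition by recursing on the second component of the argument pair.

```python
def rec(t1, t2):
    """Return g with g(0) = t1 and g(n + 1) = t2((n, g(n)))."""
    def g(n):
        acc = t1
        for k in range(n):
            acc = t2((k, acc))      # t2 receives the pair (k, g(k))
        return acc
    return g

def plus(pair):
    """Plus(a, n): start from a and apply successor n times."""
    a, n = pair
    step = lambda y: y[1] + 1       # Succ(Second(y))
    return rec(a, step)(n)
```

With these definitions plus((3, 4)) evaluates to 7. The same rec helper also defines factorial, using t1 = 1 and t2((n, r)) = (n + 1) * r.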
Formulas are terms of sort Prop. We now want to define a
constructive calculus with which we can derive true facts about
arithmetic. Our calculus is a natural-deduction style calculus
with introduction and elimination rules. The simplest rules are
the introduction and elimination rules for "FF". Written out in
standard natural deduction style, they look like this:


    FFIntro:  A & neg A  |-  FF

    FFElim:   FF         |-  A


If you have a proof of the formula before the "|-", then an
application of the rule yields a proof of the formula after the
"|-". The formulas (there can be more than one, even no formulas
at all) before the "|-" are called the hypotheses of the rule, and
the formula (there must be exactly one) after the "|-" is the
conclusion. A rule with no hypothesis is called an axiom.

These natural deduction rules (like modus ponens in the
propositional calculus example) are represented in ML as
constructors of type Thm. Elements of type "Thm" are called
proof expressions. The "FFIntro" constructor represents the
false introduction rule. It takes an argument which must be a
proof of a contradiction and the result is a proof expression
proving false. Surprisingly "FFElim" needs two arguments.
Besides the proof of false, "FFElim" needs the formula A as an
argument to indicate what the proof expression proves. Thus
"FFElim (prex: FF, A)" is a proof expression proving A (where by
prex: FF we mean prex is a proof expression proving FF). The
constructors "AndIntro", "AndElimR" and "AndElimL" represent the
following rules of the calculus:

    AndIntro:  A, B   |-  A & B

    AndElimR:  A & B  |-  A

    AndElimL:  A & B  |-  B

These constructors will fail (like modus ponens in the
propositional calculus example) if the arguments are not in the
form that the rule prescribes. "AndIntro" is a constructor that
takes two arguments, proof expressions, and forms a proof
expression of the conjunction of the arguments. "AndElimR" is a
constructor that takes one argument, a proof expression proving a
conjunction, and forms a proof expression of the left conjunct.
The constructor "AndElimL" is similar. For disjunction there are
two introduction rules and one elimination rule. The introduction
rules "OrIntroR" and "OrIntroL" look like:

    OrIntroR:  A  |-  A or B

    OrIntroL:  A  |-  B or A

The constructor "OrIntroR" has two arguments: a proof expression
which proves A and a formula B. Together these supply all the
information necessary to form a proof expression of the
disjunction.

The "OrElim" rule is slightly more complicated and takes three
arguments. The first must be a proof expression of a
disjunction; the other two arguments must be proofs of
implications with special forms.

    OrElim:  A or B, A -> C, B -> C  |-  C

The proof of an implication in a natural deduction style
calculus requires assuming A, then proving B, and discharging the
assumption A to conclude A -> B. In our calculus we do this using
the "Assume" construct and the "ImpIntro" rule as follows:

- val hyp = Assume ("hyp", A);         { Assume A. }
  ...
- val prex1 = ... hyp ... ;            { Derive a proof of, say, B. }
- val prex2 = ImpIntro (hyp, prex1);   { A proof of A -> B. }


The assumption (hyp in the example above) is said to be
discharged in the proof. Finally, there is the corresponding
elimination rule "ImpElim" which is just the familiar rule of
modus ponens.


ImpElim: A -> B, A |- B 
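The way such constructors check the form of their arguments can be sketched briefly. The Python fragment below is a simplified illustration, not the report's ML code: instead of building a proof expression it records only the formula proved and the set of undischarged assumptions, assumptions are identified by their formula rather than by a name, and all identifiers (Thm, RuleError, assume, and_intro, and so on) are hypothetical.

```python
class RuleError(Exception):
    """Raised when a rule is applied to arguments of the wrong form."""

class Thm:
    """A proved formula together with its undischarged assumptions."""
    def __init__(self, formula, hyps=()):
        self.formula = formula
        self.hyps = frozenset(hyps)

def _parts(formula, tag):
    # Formulas are atoms (strings) or tuples such as ("And", A, B).
    if not (isinstance(formula, tuple) and formula[0] == tag):
        raise RuleError("expected a formula of the form " + tag)
    return formula[1], formula[2]

def assume(a):                       # Assume: A proves A under hypothesis A
    return Thm(a, [a])

def and_intro(p, q):                 # AndIntro: A, B |- A & B
    return Thm(("And", p.formula, q.formula), p.hyps | q.hyps)

def and_elim_r(p):                   # AndElimR: A & B |- A
    return Thm(_parts(p.formula, "And")[0], p.hyps)

def and_elim_l(p):                   # AndElimL: A & B |- B
    return Thm(_parts(p.formula, "And")[1], p.hyps)

def imp_intro(hyp, p):               # ImpIntro: discharge hyp, giving A -> B
    if hyp.hyps != frozenset([hyp.formula]):
        raise RuleError("first argument must be an assumption")
    return Thm(("Imp", hyp.formula, p.formula), p.hyps - hyp.hyps)

def imp_elim(p, q):                  # ImpElim: A -> B, A |- B
    a, b = _parts(p.formula, "Imp")
    if q.formula != a:
        raise RuleError("second argument does not prove the antecedent")
    return Thm(b, p.hyps | q.hyps)
```

As a usage example, assuming A & B, forming and_intro(and_elim_l(hyp), and_elim_r(hyp)) and then discharging the assumption with imp_intro yields a theorem A & B -> B & A with no remaining hypotheses.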


With the rules governing the basic connectives out of the way,
we have primarily rules for the quantifiers and arithmetic left
over. We list here the rules and axioms that are most
self-evident.


    Truth                                 |-  True
    PeanoPostulate7   succ(n) = succ(m)   |-  n = m
    PeanoPostulate8                       |-  ~(succ(n) = 0)
    AllIntro          P(x0)               |-  All x . P(x)
    AllElim           All x . P(x)        |-  P(t)


The all-introduction rule has the usual constraint that the 
variable x is not free in any undischarged assumptions. 

Bear in mind that although these rules and axioms are presented
in their familiar mathematical form, we take them as definitions
of constructors in the representation of the logic in ML. These
rules all have straightforward representations in ML.


Three rules are more complicated: "some" introduction, "some"
elimination, and induction. First we summarize their basic
form.

    SomeIntro   P(t)                         |-  Some x . P(x)
    SomeElim    Some x . P(x), P(x0) -> Q    |-  Q
    Induction   P(0), P(n) -> P(Succ(n))     |-  All x . P(x)


The "SomeIntro" proof expression constructor actually requires
three arguments. The first argument is a "some" formula. This
provides the formula to be proved, since determining it from
P(x0) is not trivial. Also the formula indicates the name of the
bound variable, which may be convenient for renaming variables.
The second argument is a proof expression. It must prove the
scope of the "some" formula P for a particular term. The third
argument must be the particular term x0 for which one showed in
the second argument that P(x0). Put together, the use of the
"SomeIntro" constructor looks like this:

    SomeIntro (Some(x,P), prex:P(x0), x0)

"SomeElim" also has three arguments: the proof expression of
some existentially quantified formula, the variable used to refer
to the term postulated, and the proof of an implication with the
appropriate hypothesis.

    SomeElim (prex1: Some(x,P), x0, prex2: P(x0) -> Q)

The implementation of the "SomeElim" constructor must not overlook
the usual constraint on the use of the rule: namely, x0 can't
occur free in any undischarged assumptions of prex2.

"Induction" has four arguments: the "All" formula to prove, the 
base case P(0), the induction variable, and a proof of the 
implication P(n) -> P(n'). 

    Induction (All (m, P), prex1: P(0), n, prex2: P(n) -> P(n'))

The variable n must not occur free in any undischarged 
assumptions of prex2. 

Now comes another set of rules, called the computation rules.
First there is beta-reduction. The proof expression

    BetaReduction (Abstraction (v, b), t)

is a proof expression proving the following formula:

    Application (Abstraction (v,b), t) = b[v/t]

where t is free for v in b. Strictly speaking the formula is:

    Eq (Application (Abstraction (v,b), t), Subs (b,v,t))

The remaining rules are for the Rec and If constructs. The proof
expression

    BaseCase (Rec(b,i))

is a proof of

    Application (Rec(b,i), Zero) = b

and

    IndStep (Rec(b,i), x)

is a proof of

    Application (Rec(b,i), Succ x) =
        Application (i, Pair (x, Application (Rec(b,i), x)))

For the If construct there are two rules. The proof expressions

    TrueIf (prex: P, If(P,t,s))

    FalseIf (prex: Not (P), If(P,t,s))

are proofs of

    If(P,t,s) = t

    If(P,t,s) = s

respectively.
Finally, the rules concerning equality should be mentioned.
They are the typical rules one would expect.

    Reflexivity                   |-  n=n
    Symmetry        n=m           |-  m=n
    Transitivity    n=m, m=p      |-  n=p
    Congruence      f=g, x=y      |-  f(x)=g(y)
    Substitution    x=y, P(x)     |-  P(y)
    EqualityElim    P=Q, Q        |-  P


Now that we have the syntax of a sample programming logic, it
is time to give an example of a proof. We give a proof of the
following theorem:

    All x:NN (x=0 or Some y . Succ (y) = x)

It has a very simple proof by induction which is built up as
follows, where Variable constructs a variable given a string and a
sort.


- val x, y = Variable ("x", NN), Variable ("y", NN);
- val SOME (term) = Some (y, Equal (Succ y, term));
- val EQ0 (term) = Equal (term, Zero);
- val ALL = All (x, Or (EQ0 (x), SOME (x)));


(We have suppressed the ML response to these lines since it does
not add to the discussion.) The first line declares two
variables for use in the proof. The remaining lines make
abbreviations used to build formulas that will come up in the
proof. The theorem to be proved is expressed as the last of
these formulas, "ALL". The base case of the induction requires
proving the formula "Or (EQ0 Zero, SOME (Zero))" which is
proved easily by reflexivity.

- val base = OrIntroL (Reflexivity Zero, SOME (Zero));


The induction step requires more work. The induction hypothesis
is not required in any material way in the proof of "Or (EQ0
(Succ x), SOME (Succ x))".

- val indhyp = Assume ("indhyp", ALL);
- val prex1 = Reflexivity (Succ x);
- val prex2 = SomeIntro (SOME (Succ x), prex1, x);
- val prex3 = OrIntroR (EQ0 (Succ x), prex2);
- val indstep = ImpIntro (indhyp, prex3);
- val proof = Induction (ALL, base, x, indstep);


In the last line, the base case and the induction step are
combined to yield a proof expression corresponding to the desired
proof.


4.5 Constructive Mathematics 

Thus far nothing should seem very unusual. We have described a
theory with its language and rules. So we know how to make
proofs in the system. In a programming logic, on the other hand,
the goal will require proving a different kind of theorem.
Typically one will want to prove a theorem of the form "for all
x there exists y". Then by the nature of a programming logic,
the interpreter when supplied with the proof and a particular x
will produce a y with the desired property. Thus to program the
maximum function, one would just prove a theorem in a programming
logic. For instance,

    All x, y . Some z . (z=x or z=y) & z>=x & z>=y

Suppose the proof of this theorem was called "MaxThm". Then
using the interpreter to evaluate the expression formed by
applying MaxThm to two integers will result in an integer with the
desired property. In order to write this interpreter some care
is needed in formulating the rules of the programming logic.
Fortunately, we can draw on the experience of the constructive
school of mathematics for help in devising these rules. Their
criticisms of classical mathematics provide insight into the
problem.

In particular we have avoided certain axioms that are taken for
granted in non-constructive logics. Typically the "law of
excluded middle", the axiom A or neg A, is used freely. Because
we do not include this axiom, many formulas held to be true will
not be provable. For example, let F stand for the statement of
Fermat's last theorem. Some mathematicians hold that F or neg F
is a true formula and they appeal to the law of excluded middle.
We have rejected this axiom in the programming logic because of
its lack of constructive content. Consider for the moment the
possibility of adding this axiom. Represent the axiom by a
constructor "ExcludedMiddle" which requires one formula as an
argument. Thus ExcludedMiddle (F) is a proof expression proving
that Fermat's theorem is true or false. Consider what happens
when this proof expression is used in conjunction with
or-elimination:

    OrElim (ExcludedMiddle (F), case1, case2)

Suppose that case1 evaluates or reduces to 1 when given a proof
of Fermat's theorem and case2 reduces to 2 when given a proof of
its negation. In either case the proof expression reduces to a
natural number. But which one? The interpreter cannot figure
out which. In a programming logic all expressions which can
reduce to a natural number do reduce mechanically to a number in
canonical form (like 45 or 17). (The interpreter or the evaluator
which performs the reductions is the subject of the next
section.)

For the rules as we formulated them it is easy to see how we
can justify the claim that evaluation will be mechanical. Since
an "or" formula can be proved only by "OrIntroL" or "OrIntroR"
and thus the case is explicitly tagged, there will never be any
problem deciding which case is true. It should also be apparent
how a particular value can be computed: a "some" formula can be
proved only by "SomeIntro" and this requires a particular term to
be supplied. This value will be used by the interpreter.

4.6 An Interpreter for Arithmetic 

As in the programming languages ML and LISP, the interpreter for
our arithmetical calculus will take an expression (a proof
expression) and return a value. In our case the value returned
is a simplified proof expression. Church's lambda calculus
provides the rules by which LISP expressions are evaluated. The
most important of these rules is beta reduction. Since we have
lambda terms in our arithmetical calculus we expect to find at
least the beta reduction rule. In fact there are many such
reduction rules in our arithmetical calculus, and we will go
through these reduction rules now. Later we will see exactly
what expressions we actually have to enter into our system in
order to evaluate programs like the maximum function or the
subtraction function.

The rules for pairs are particularly simple.

    First (Pair (t,s)) --> t
    Second (Pair (t,s)) --> s

The rules dealing with conjunction are similar.

    AndElimR (AndIntro (prex1, prex2)) --> prex1
    AndElimL (AndIntro (prex1, prex2)) --> prex2

Here is the reduction rule for beta reduction.

    Application (Abstraction (x, b), t) --> b[x/t]

The notation b[x/t] means that the term t is substituted for the
variable x in the term b. If any free variable of t is captured
the result is an error message. There is a rule for the "AllElim"
proof expression.

    AllElim (AllIntro (x, prex), t) --> prex[x/t]

The rule for "ImpElim" is similar.

    ImpElim (ImpIntro (hyp, prex1), prex2) --> prex1 [hyp/prex2]

This is an important reduction since it comes up as a part of the
remaining rules. It is worth considering this reduction a little
more closely. The "ImpIntro" constructor enforces that the proof
expression "hyp" is in the form "Assume (name, A)" where A is
some formula. The "ImpElim" constructor enforces that whatever
proof expression its first argument is, it is a proof of A->B.
This much follows from the definition of the proof expression
constructors. Clearly, the proof expression "ImpIntro (hyp,
prex1)" is one such proof expression proving A->B for some
formula B. But there are others, including some which are not
necessarily implication-introduction expressions. The proof
expression could be, for instance, a some-elimination or an
or-elimination proof expression and still be a proof of A->B. One
could say that the type of the first argument must be A->B. Of
course, if the evaluation is to continue these expressions of
type A->B must evaluate to an implication-introduction expression
so that the implication-elimination reduction rule can be
applied.

The reduction rules for "or", "some" and induction complete the
list of reduction rules.

    OrElim (OrIntroR (A, prex1), prex2, prex3)
        --> ImpElim (prex2, prex1)

    OrElim (OrIntroL (A, prex1), prex2, prex3)
        --> ImpElim (prex3, prex1)

    SomeElim (SomeIntro (S, prex1, t), y0, prex2)
        --> ImpElim (prex2 [y0/t], prex1)

    AllElim (Induction (A, prex1, n, prex2), Zero)
        --> prex1

    AllElim (Induction (A, prex1, n, prex2), Succ x)
        --> ImpElim (prex2, AllElim (induc, x)) [n/x]

The proof expression "induc" in the last line is just the
original induction expression:

    Induction (A, prex1, n, prex2)


The role of the interpreter is to apply any of these reduction 
rules until none of them are applicable. We call this process 
normalization, simplification or reduction. 
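A small part of such a normalizer can be sketched as follows. This Python fragment is an illustration under simplifying assumptions and not the report's interpreter: proof expressions are tagged tuples, only the conjunction, implication and "or" rules are implemented, substitution replaces an assumption wherever it occurs and relies on assumption names being distinct, and the names head, subst and normalize are hypothetical.

```python
def head(e):
    """Constructor name of a proof expression, or None for a leaf."""
    return e[0] if isinstance(e, tuple) else None

def subst(e, hyp, repl):
    """Replace every occurrence of the assumption hyp in e by repl."""
    if e == hyp:
        return repl
    if isinstance(e, tuple):
        return tuple(subst(c, hyp, repl) for c in e)
    return e

def normalize(e):
    """Apply the reduction rules until none of them is applicable."""
    if not isinstance(e, tuple):
        return e
    e = tuple(normalize(c) for c in e)       # reduce subexpressions first
    tag = e[0]
    if tag == "AndElimR" and head(e[1]) == "AndIntro":
        return e[1][1]                       # --> prex1
    if tag == "AndElimL" and head(e[1]) == "AndIntro":
        return e[1][2]                       # --> prex2
    if tag == "ImpElim" and head(e[1]) == "ImpIntro":
        _, hyp, body = e[1]                  # --> prex1 [hyp/prex2]
        return normalize(subst(body, hyp, e[2]))
    if tag == "OrElim" and head(e[1]) == "OrIntroR":
        return normalize(("ImpElim", e[2], e[1][2]))
    if tag == "OrElim" and head(e[1]) == "OrIntroL":
        return normalize(("ImpElim", e[3], e[1][2]))
    return e
```

Because the introduction form at the head of an "or" proof names the case explicitly, the OrElim branches above never have to guess which implication to apply. That is the mechanical behaviour argued for in Section 4.5.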


With the basic normalizing procedure in mind, consider again
the subtraction example. There we had a proof expression of the
basic form

    Induction (All (x, P), base, x, indstep)

Normalization will produce no change in this proof expression.
It is already in normal form. Most often this will be the case
unless by oversight a proof with needless steps was done. A
reasonable implementation of normalization can, however, provide
one most important service: it can guarantee that there are no
free variables or undischarged assumptions in the proof. So even
if a proof expression does not simplify, it is certified to
represent a proof in our system.

The interpreter is not limited to the role of proof checker.
Consider the following proof expression, still using the
subtraction example.

    AllElim (Induction (All (x, P), base, x, indstep), Zero)

The result of evaluating this proof expression is simply base or

    OrIntroL (Reflexivity Zero, SOME (Zero))

When we apply the proof by induction to "One" we get the
following chain of proof expressions.

    --> ImpElim (indstep, AllElim (Induction (...), Zero)) [x/Zero]
    --> ImpElim (indstep [x/Zero], OrIntroL (...))
    --> ImpElim (ImpIntro (indhyp [x/Zero], prex3 [x/Zero]),
                 OrIntroL (...))
    --> prex3 [x/Zero] [indhyp/OrIntroL (...)]
    --> prex3 [x/Zero]
    --> OrIntroR (EQ0 (Succ Zero), prex)

where prex is the proof expression:

    SomeIntro (SOME (Succ Zero), Reflexivity (Succ Zero), Zero)

Applying the induction proof to "Two" yields a similar
or-introduction proof expression; this time, however, prex is the
proof expression:

    SomeIntro (SOME (Succ One), Reflexivity (Succ One), One)

Thus, we have written a primitive program to subtract one from
any natural number and have run the program on the first three
natural numbers. Notice that all the information about the
results of the subtraction is still around. We see that if
subtraction does not apply (that is, subtracting one from zero),
then the resulting proof expression is an or-introduction-left
proof expression. If the subtraction worked, the result is an
or-introduction-right proof expression. The result of
subtracting one from the given quantity can be found buried in
the some-introduction rule. It is the term given as a witness
that there is a number whose successor is the given value.

There are two parts to the some-introduction proof expression:
the witness, and the proof that the witness has some property.
We see from the above example that we may want to throw out the
proof part as being unimportant and actually pick out the
witness. Suppose we had in our language a function "Witness"
that evaluated as follows:

    Witness (SomeIntro (S, prex, t)) --> t

The function "Witness" could be used to ignore the proof part of
a some-introduction proof expression. Finally we have all the
mechanisms necessary to extract a function from a proof of the
form "for all x there exists y". (Actually "Witness" is definable
from the existing constructors, but as far as we are
concerned here, it can be taken as primitive.)

The subtraction theorem is an interesting case, because it is
an example of a partial function. The value of the function at
zero is avoided. Compare the formulation of the theorem as was
given originally

    All x . (x=0 or Some y . succ (y) = x)

with the alternate formulation:

    All x . Some y . (~(x=0) -> succ (y) = x)

This alternate formulation can be proved by induction, but the
proof is slightly more difficult. But suppose we have a proof of
it, call it "SubThm". The base case is vacuously true, but must
be proved by some-introduction which requires a witness
nevertheless. Say the base case was proved with 34 as the
witness (any number will do, of course). Now consider the result
of evaluating some proof expressions containing "SubThm".

    Witness (AllElim (SubThm, Zero)) --> ThirtyFour

    Witness (AllElim (SubThm, One)) --> Zero

    Witness (AllElim (SubThm, Two)) --> One
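The computational reading of both formulations can be imitated directly. The following Python sketch is an illustration only: it models the computational content of the proof expressions rather than the expressions themselves, natural numbers are Python integers, the proof parts are placeholder strings, and the names induction, witness, pred and sub_thm are hypothetical.

```python
def or_intro_l(proof):
    return ("OrIntroL", proof)

def or_intro_r(proof):
    return ("OrIntroR", proof)

def some_intro(witness_term, proof):
    return ("SomeIntro", witness_term, proof)

def induction(base, step):
    """Computational content of AllElim (Induction (...), n)."""
    def all_elim(n):
        result = base                 # the Zero case reduces to base
        for k in range(n):
            result = step(k, result)  # the Succ case applies the step
        return result
    return all_elim

def witness(e):
    """Witness (SomeIntro (S, prex, t)) --> t"""
    tag, w, _proof = e
    if tag != "SomeIntro":
        raise ValueError("not a some-introduction")
    return w

# All x . (x=0 or Some y . succ (y) = x)
pred = induction(
    or_intro_l("Reflexivity Zero"),
    lambda x, _ih: or_intro_r(some_intro(x, "Reflexivity (Succ x)")))

# All x . Some y . (~(x=0) -> succ (y) = x), base case witnessed by 34
sub_thm = induction(
    some_intro(34, "vacuous"),
    lambda x, _ih: some_intro(x, "succ (x) = succ (x)"))
```

In this model pred(0) is tagged OrIntroL, pred(2) is tagged OrIntroR and carries the witness 1, and witness(sub_thm(n)) for n = 0, 1, 2 gives 34, 0 and 1. The induction hypothesis is ignored by both step functions, as in the proof given in the text.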

The discussion above should have given some idea as to how a
programming logic is a programming language. One could easily
make an interpreter that interacts with the user just like the ML
interpreter. There would be a similar sort of dialog with the
user typing in an expression and the interpreter returning the
simplified form of the expression. For example, a dialog
concerning "SubThm" might look like:

- Witness (AllElim (SubThm, Zero));
> ThirtyFour : Nat

- Witness (AllElim (SubThm, One));
> Zero : Nat

- Witness (AllElim (SubThm, Two));
> One : Nat


Thus we have seen that the interpreter in a programming logic 
is both a theorem checker (or, equivalently a program verifier) 
and a functional programming language evaluator. 


4.7 Incorporating the Asymptotic Paradigm 

In this section we raise some of the issues that arise when 
programming logic is extended to non-standard analysis. Our 
discussion is quite tentative. What we are aiming at is an 
extension of our previous constructive arithmetic calculus to 
some form of NSA. Our target system is best illustrated by the 
example given in the next section. 

We have already encountered one technical problem in using 
non-standard numbers: induction. The induction rule of 
arithmetic must be modified to exclude the proof of external 
formulas (those containing the std predicate). 
But there are other difficulties as well. The form of the
axioms in the programming logic for the real numbers remains to
be worked out. Much mathematical research has been done in the
area of constructive analysis ([2], [3], [4]). This work guides
the attempts to formalize the reasoning concerning real numbers.
The difficulty lies in that some of the ordinary axioms for real
numbers do not have constructive content. So they have problems
similar to the law of excluded middle discussed in Section 4.5.
One such axiom for real numbers is dichotomy; another is that odd
degree polynomials have a root. The first axiom asserts that one
of two things happens and the second asserts the existence of
some quantity. Axioms in these two forms pose the difficulty.

Consider for a moment dichotomy. We certainly want that x<=y
or x>y for all real numbers. But if we take this as an axiom,
then the interpreter will have to be able to decide for any real
numbers which case holds. This is difficult for arbitrary real
numbers. For instance, how is the interpreter to know if
f(x)+2.3 is greater than g(x+y)/x? For example, consider the
following proof expression:

    OrElim (Dichotomy (x,y), prex1, prex2)

where "Dichotomy (x,y)" is a proof expression proving x<=y or x>y.
The interpreter will in general be unable to figure out which
branch to take. It is unclear at this point if this causes a
problem in practice.

2. A. S. Troelstra, Metamathematical Investigation of
Intuitionistic Arithmetic and Analysis, (Springer-Verlag: Berlin,
1973).

3. Arend Heyting, Intuitionism: An Introduction, (North-Holland:
Amsterdam, 1971).

4. Errett Bishop, Foundations of Constructive Analysis,
(McGraw-Hill: New York, 1967).

One solution is to restrict the use of the axiom to values the
interpreter can actually test. Given two floating-point
numbers stored in the computer, the interpreter can test them to
find which is the larger. The reduction rule would then look
something like this:

    if x<=y then
        OrElim (Dichotomy (x, y), prex1, prex2)
            --> ImpElim (prex1, FACT1: x<=y)
    otherwise
        OrElim (Dichotomy (x, y), prex1, prex2)
            --> ImpElim (prex2, FACT2: x>y)

FACT1 and FACT2 are proof expressions proving that the
appropriate relationship holds between x and y. These are axioms
in a sense, but cannot be invoked by the user.
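The restricted reduction rule is easy to imitate for machine numbers. The sketch below is an illustration under stated assumptions: Python floats stand in for the stored floating-point numbers, the two implication proofs are modelled as Python functions that receive the fact produced by the test, and the names or_elim_dichotomy and maximum are hypothetical. NaN values are rejected explicitly, since for them neither x<=y nor x>y holds in floating-point arithmetic.

```python
import math

def or_elim_dichotomy(x, y, prex1, prex2):
    """Reduce OrElim (Dichotomy (x, y), prex1, prex2) for machine numbers."""
    if math.isnan(x) or math.isnan(y):
        raise ValueError("dichotomy cannot be decided for NaN")
    if x <= y:
        return prex1(("FACT1", x, y))   # ImpElim (prex1, FACT1: x<=y)
    return prex2(("FACT2", x, y))       # ImpElim (prex2, FACT2: x>y)

def maximum(x, y):
    """A maximum function obtained by case analysis on the dichotomy."""
    return or_elim_dichotomy(x, y, lambda _fact: y, lambda _fact: x)
```

The facts FACT1 and FACT2 are created only inside the reduction, which matches the remark that they are axioms of a kind the user cannot invoke directly.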

Arbitrary arithmetic is also the problem with
some-introduction. It may prove useful to identify a certain
class of terms in NSA, call them the computable terms. This set
of terms includes all the variables and constants, and is closed
under the machine operations, ++, --, **, and //. Also in the
list of operations which produce computable terms are Skolem
functions for existential axioms (we will encounter these
shortly) and other functions defined with the Rec and If
constructs as long as they contain only computable terms. One can
check syntactically if a term is computable.
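Such a syntactic check amounts to a single recursive pass over the term. The Python sketch below is an illustration only: the tagged-tuple representation of terms, the tag names "Var", "Const" and "Skolem", and the function name is_computable are assumptions made for the example and are not taken from the report.

```python
MACHINE_OPS = {"++", "--", "**", "//"}

def is_computable(t):
    """Decide syntactically whether a term is a computable term."""
    tag = t[0]
    if tag in ("Var", "Const"):            # ("Var", name), ("Const", value)
        return True
    if tag in MACHINE_OPS:                 # ("++", t1, t2) and so on
        return is_computable(t[1]) and is_computable(t[2])
    if tag == "Skolem":                    # ("Skolem", name, argument)
        return is_computable(t[2])
    if tag in ("Rec", "If"):               # every subterm must be computable
        return all(is_computable(s) for s in t[1:])
    return False                           # ideal operations and the like
```

A term built with an ideal operation, written here with the hypothetical tag "+", is rejected, while the same term built with the machine operation "++" is accepted.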

Since some-introduction requires the interpreter to actually
compute the value of the witness, some restriction on the witness
is to be expected. The natural candidates for witnesses are the
computable terms. The same restriction must apply to
all-elimination.

Now we examine the constructive content of another axiom of NSA.
One of the basic axioms of NSA was that the range of the cropping
function is finite. One possible way of formalizing this in the
language is as follows.

    FIN1: Some i . f(i+1) >= f(i)

    FIN2: Some i . f(i+1) <= f(i)

We will choose a slightly more convenient form of these axioms by
naming a Skolem function which computes the desired point in the
sequence.

    FIN1: f(FIN1(f)+1) >= f(FIN1(f))

    FIN2: f(FIN2(f)+1) <= f(FIN2(f))

These axioms can be instantiated with any function "f" of the
right type. We require that "f" be a computable operator.
How does this capture the fact that the range of the cropping
function is finite? For one, it is a necessary condition. If
the range of the cropping function is finite, then the set of
machine representable numbers is finite, and thus the set of
values the interpreter can return by evaluating variable-free
machine terms is even smaller. On the other hand, the axioms
appear sufficient for practical purposes. The axioms permit
arguments of the sort that there are no infinite descending (or
ascending) sequences of machine values.

If these axioms are to be understood by an interpreter, their
constructive content must be understood. This is especially
critical for these existential axioms, since such existential
statements must actually produce the values they claim exist.
Fortunately, this poses no problem here since we know no sequence
of machine representable numbers can keep increasing (or
decreasing) forever. We can find the place where the sequence
stops increasing (or decreasing) by just examining the values in
the sequence one by one. This may not be efficient, but it is
guaranteed to work since the range of machine representable
numbers is finite. The interpreter can compute the values of the
sequence until one with the right property is found. This value
can then be used for FIN1(f) or FIN2(f). Eventually the
algorithmic part of the axioms can be compiled into simple while
loops. Here is the while loop for the FIN2 axiom given f; it
stops at the first i with f(i+1) <= f(i). The loop for FIN1 is
the same with the comparison reversed.

    i := 0;
    while f(i+1) > f(i) do i := i+1 end;
    return (i);
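The linear searches behind the two Skolem functions can be written out in a few lines. The Python sketch below follows the axioms exactly as stated above (FIN1 with >=, FIN2 with <=). It is an illustration only: the names fin1 and fin2 and the max_steps guard are hypothetical additions, the guard being needed because an arbitrary Python function, unlike a sequence of machine representable numbers, is not guaranteed to stop decreasing or increasing.

```python
def fin1(f, max_steps=1_000_000):
    """Smallest i with f(i + 1) >= f(i): the sequence stops decreasing."""
    i = 0
    while f(i + 1) < f(i):
        i += 1
        if i >= max_steps:
            raise RuntimeError("no witness found within max_steps")
    return i

def fin2(f, max_steps=1_000_000):
    """Smallest i with f(i + 1) <= f(i): the sequence stops increasing."""
    i = 0
    while f(i + 1) > f(i):
        i += 1
        if i >= max_steps:
            raise RuntimeError("no witness found within max_steps")
    return i
```

For the sequence f(i) = max(10 - i, 3), which decreases from 10 down to 3 and then stays constant, fin1 returns 7, the first index at which the next value is no smaller.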

There is a gray area in NSA where the desire for constructive
content of the axioms competes with the need for expressing
idealized computations. This mixture of constructive and
non-constructive rules does not in and of itself cause the
interpreter any problem. What the interpreter does not
understand cannot be simplified. This leads to the following
problem. A theorem (using non-constructive constructs) of the
form "for all real numbers x there exists a real number y" can be
verified as a correct theorem in the theory, but when applied to
a particular value x the proof expression may not simplify to the
real number y in normal form. We can take normal form for real
numbers to mean a variable-free computable term. This diminishes
the usefulness of the verification. For the number-theory
programming logic presented previously it is conceivable to prove
a meta-theorem that all proof expressions representing natural
numbers can be reduced to a series of successor functions applied
to zero. Such a meta-theorem is highly desirable for NSA. Most
likely it will be easier to define a subset of NSA that can be
mechanically recognized and can be shown to be normalizable.
proved correct in a programming logic version of NSA. As
mentioned previously we view the example as a test case for 
building such a logic. The programming logic sketched in the 
previous section, while tentative, is adequate for carrying out 
the following proof which is a program. Since the logic was 
constructed so that all the axioms and rules of inference have 
constructive content, the proof can be executed by the 
interpreter. 

The example we shall use is Newton's method for computing the 
square root of any real number. 

Square Root Theorem. All x:rea] . Some r:real . x>l => r*r==x 

A proof of this formula will be a function that can take any 
machine representable number and if it is greater than one, this 
function will produce another machine representable real number 
whose square is infinitesimally close to the original number. 
The remainder of this section is devoted to showing what is 
involved in formalizing the proof of the Square Root Theorem. 

The proof of the theorem will certainly require many of the
ordinary facts about the ideal real numbers. We will use the
field axioms and the order axioms without much comment. Note
that we do not expect any part of the proof dealing with ideal
real numbers ever to affect the evaluation of a proof
expression.
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To prove that the square root exists we will first define a 
sequence of machine representable numbers that get closer and 
closer to the square root of x. Here is the recursive definition 
of the sequence written in ML. 

    val F (i:Nat): Real =
        if i=0 then x
        else let val Next = (F(i-1)++(x//F(i-1)))//2 in
                 if Next > F(i-1) then F(i-1) else Next
             end;

It is clear that we could have defined F using the Rec and If
constructs, but such a definition would not be perspicuous.
Notice that we can prove that F(i) is machine representable for
all i, since all the computational steps (including ++, // and >)
are perfectly understandable by the interpreter.

The computation rules permit the following conclusions about 
the recursively defined function F: 

    F(0) = x

    F(i) = if Next > F(i-1) then F(i-1) else Next
           where Next = (F(i-1)++(x//F(i-1)))//2

The computation rules for the If construct give rise to two more 
rules which can be used in the proof. 

    Next > F(i-1)          ~(Next > F(i-1))
    -------------          ----------------
    F(i) = F(i-1)          F(i) = Next

Using these rules is the only means of proving properties about
recursive definitions. We will need these particular rules to
continue the proof of the Square Root Theorem.


We can prove by induction that this sequence F has the
property:


Property 1. All n:Nat . F(n) >= F(n+1) 

Notice that this proof requires no reasoning about floating-point 
calculations whatsoever. The proof relies solely on the 
definition of F. Another property of the sequence F provable by
induction is:


Property 2. All n: Nat . x >= F(n) >= 1 

That x>=F(n) holds follows from Property 1 about F together with
F(0)=x. That F(n)>=1 holds requires knowing something about
floating-point computations. In particular, we need to know that
y>=z (with z>0) implies y//z>=1, and that y>=1 and z>=1 imply
(y++z)//2>=1. These facts follow from the monotonicity of the
cropping function. We will give a more detailed proof of
Property 2 later.
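Properties 1 and 2 can be checked experimentally. The following is
only a sketch under stated assumptions: Python's IEEE double
precision floats stand in for the machine reals, the built-in + and
/ stand in for ++ and //, and the names next_value and
check_properties are ours, not part of the report's logic. A finite
test of this kind illustrates the properties; it does not prove them.

```python
def next_value(prev: float, x: float) -> float:
    """One guarded Newton step; float + and / play the roles of ++ and //."""
    candidate = (prev + x / prev) / 2.0
    # The guard from the definition of F: never move upward.
    return prev if candidate > prev else candidate


def F(i: int, x: float) -> float:
    """Iterative form of the recursive definition: F(0) = x, F(i) from F(i-1)."""
    value = x
    for _ in range(i):
        value = next_value(value, x)
    return value


def check_properties(x: float, steps: int = 60) -> bool:
    """Check Property 1 (F(n) >= F(n+1)) and Property 2 (x >= F(n) >= 1)."""
    values = [F(n, x) for n in range(steps + 1)]
    nonincreasing = all(values[n] >= values[n + 1] for n in range(steps))
    bounded = all(x >= v >= 1.0 for v in values)
    return nonincreasing and bounded
```

For example, check_properties(2.0) examines the first 61 terms of
the sequence for x = 2.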

Now we give a sketch of the proof of the Square Root Theorem.
By the finiteness axioms we know that there is some i0 for which
F(i0+1)>=F(i0). This combined with the fact (Property 1) that
F(n)>=F(n+1) implies that F(i0+1)=F(i0). What does this mean? If
F(i0+1) equals (F(i0)++(x//F(i0)))//2, then we have what we would
expect since ideally

    F(i) = (F(i)+(x/F(i)))/2 implies F(i)*F(i)=x

Let us set Next equal to the quantity (F(i0)++(x//F(i0)))//2. Now
suppose F(i0+1) = Next; then we must show F(i0)*F(i0)==x. First
we must know that Next is infinitesimally close to its ideal
counterpart, (F(i0)+(x/F(i0)))/2. (This result is proved in Lemma
1 below.) Since Next = F(i0+1) = F(i0), it follows that
(F(i0)+(x/F(i0)))/2 == F(i0). From this follows (Lemma 2) the
desired result. The proof is remotely similar to the ideal
mathematical case, but there are many details to check. The
floating-point computations do have the needed properties, as
their ideal counterparts do, but verifying this requires more
effort and we put it off for the moment.

The proof is not yet finished. It need not be the case that
F(i0+1) = Next. Recall the definition of the function F. If Next
> F(i0) then F(i0+1)=F(i0). This would be the case when cropping
errors in the computation of the next value in the sequence did
not result in a value that was less than or equal to the previous
value. But nevertheless we have F(i0)*F(i0)==x, since Next is
really very close to F(i0). In fact we can prove that Next >
F(i0) implies that (F(i0)+(x/F(i0)))/2 == F(i0). This is the
content of Lemma 3. We defer this proof as well. F(i0)*F(i0) ==
x follows again from Lemma 2.

All that is needed to complete the proof is to put the two
cases, Next <= F(i0) and Next > F(i0), together. The cases are
exhaustive (by dichotomy) and in each case we have the desired
conclusion. This suggests the or-elimination rule.

    OrElim (Dichotomy(Next, F(i0)), case1, case2)

Finally, we pick as the square root F(i0). The proof expression
for the whole proof takes on the following form.

    AllIntro (x, SomeIntro (S, OrElim (...), F(i0)))
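The computational content of this proof sketch can be imitated
directly. The sketch below is an illustration under assumptions, not
the proof expression itself: IEEE doubles stand in for the machine
reals, the function name guarded_sqrt is ours, and "infinitesimally
close" is replaced by a small relative tolerance, which is a much
weaker, finite stand-in for ==.

```python
import math


def guarded_sqrt(x: float) -> float:
    """Return F(i0), where i0 is the first index with F(i0+1) == F(i0)."""
    if not x > 1.0:
        raise ValueError("the Square Root Theorem assumes x > 1")
    current = x  # F(0) = x
    while True:
        candidate = (current + x / current) / 2.0  # Next
        following = current if candidate > current else candidate  # F(i+1)
        if following == current:
            # The sequence is non-increasing and takes only finitely many
            # float values, so this point is always reached.
            return current
        current = following


def square_is_close(r: float, x: float, rel_tol: float = 1e-12) -> bool:
    """Finite stand-in for r*r == x (infinitesimal closeness)."""
    return math.isclose(r * r, x, rel_tol=rel_tol)
```

Both exits of the loop correspond to the two cases of the
dichotomy: either Next equals F(i0), or Next > F(i0) and the guard
keeps F(i0).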

We have just seen an overview of the proof of the Square Root 
Theorem. It is time now to go back and fill in the details. 

First we prove Property 2. The proof proceeds by induction.
For n=0 we must show that x>=F(0)>=1. Since F(0)=x and we
assumed x>=1, this is trivial. So now we assume the induction
hypothesis x>=F(n)>=1 and prove that x>=F(n+1)>=1. Set Next to
be (F(n)++(x//F(n)))//2. If Next>F(n) then F(n+1)=F(n) and we are
finished. Otherwise Next<=F(n) and F(n+1)=Next. Since x>=F(n),
x>=F(n+1). Now comes the hard part: showing F(n+1)=Next>=1. We
must analyse the floating-point operations in
Next=(F(n)++(x//F(n)))//2. Since x>=F(n) and ~(F(n)=0), we expect
that x//F(n)>=1. This is in fact the case. Since F(n)>=1, we
expect that F(n)++(x//F(n))>=2. Finally dividing by 2 we get
Next>=1, the desired conclusion.

The properties of the floating-point operations used above do
follow from the axioms of NSA. Let us examine one of these facts
in greater detail. From y>=z it follows that y//z>=1. Expanding
the definition of y//z we get ~(z=0) & w=CR(y/z) & w>=1.
Assuming that ~(z=0) (and, as in our application, z>0) we get
that y/z>=1 by the axioms of ideal arithmetic. By the
monotonicity of CR we have CR(y/z)>=CR(1)=1. Hence w>=1.
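The two floating-point facts used in the proof of Property 2 can be
spot-checked on a concrete machine. This is a sketch only: it
assumes IEEE double arithmetic as the cropped arithmetic, the
function names are ours, and random sampling is of course no
substitute for the derivation from the axioms.

```python
import random


def division_fact(y: float, z: float) -> bool:
    """For y >= z > 0, the rounded quotient should be at least 1."""
    return y / z >= 1.0


def averaging_fact(y: float, z: float) -> bool:
    """For y >= 1 and z >= 1, the rounded average should be at least 1."""
    return (y + z) / 2.0 >= 1.0


def facts_hold_on_samples(count: int = 10000, seed: int = 0) -> bool:
    """Check both facts on random pairs drawn from [1, 1e6]."""
    rng = random.Random(seed)
    for _ in range(count):
        a = rng.uniform(1.0, 1e6)
        b = rng.uniform(1.0, 1e6)
        y, z = max(a, b), min(a, b)  # ensures y >= z >= 1
        if not (division_fact(y, z) and averaging_fact(y, z)):
            return False
    return True
```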

For Lemmas 1, 2 and 3 we set Next equal to
(F(i0)++(x//F(i0)))//2 and IdealNext equal to
(F(i0)+(x/F(i0)))/2.

Lemma 1 states that Next == IdealNext. The proof proceeds as
follows.

    x//F(i0) == x/F(i0)
    F(i0)++(x//F(i0)) == F(i0)+(x/F(i0))
    (F(i0)++(x//F(i0)))//2 == (F(i0)+(x/F(i0)))/2

Each one of these steps depends on a similar argument about
floating-point calculations which makes use of the fact that
fin(x) implies CR(x)==x. So, each step reduces to showing that
the appropriate quantity is finite. In the first step, for
example, we must show that x/F(i0) is finite. But that follows
from Property 2: since x>=F(i0)>=1 we have 1<=x/F(i0)<=x, so
x/F(i0) is finite whenever x is.

Lemma 2 states that IdealNext == F(i0) implies F(i0)*F(i0) ==
x. The proof proceeds as follows.

    (F(i0)+(x/F(i0)))/2 == F(i0)
    F(i0)+(x/F(i0)) == 2*F(i0)
    x/F(i0) == F(i0)
    x == F(i0)*F(i0)

Each step follows from a similar argument about floating-point
computations. The essence of the first step is y/2==z implies
y==2*z. Expanded this yields inf(y/2 - z) implies inf(y - 2*z).
This follows from the fact that for all epsilon

    |y/2 - z| < epsilon/2 implies |y - 2*z| < epsilon

Recall that inf(x) is defined to be

    All epsilon . (std(epsilon) & epsilon > 0) -> |x| < epsilon
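The epsilon/2 step can be written out in full. The derivation below
is a sketch in ordinary notation; it uses only the definition of inf
and the fact, assumed here, that epsilon/2 is standard and positive
whenever epsilon is.

```latex
% Assume inf(y/2 - z): for every standard \varepsilon > 0,
% |y/2 - z| < \varepsilon. Fix a standard \varepsilon > 0.
% Then \varepsilon/2 is also standard and positive, so
\begin{align*}
  \left| \tfrac{y}{2} - z \right| &< \tfrac{\varepsilon}{2}, \\
  |y - 2z| = 2 \left| \tfrac{y}{2} - z \right|
           &< 2 \cdot \tfrac{\varepsilon}{2} = \varepsilon .
\end{align*}
% Since \varepsilon was an arbitrary standard positive real,
% inf(y - 2z) holds, i.e. y == 2*z.
```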

Lemma 3 states that Next > F(i0) implies IdealNext == F(i0). By
the laws of ideal arithmetic we have that IdealNext < F(i0).
Hence IdealNext < F(i0) < Next. From Lemma 1 we have that Next ==
IdealNext. Since F(i0) lies between two infinitesimally close
quantities, clearly IdealNext == F(i0).
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Chapter 5 


Technical Feasibility 

This report, although incomplete in places, shows the
feasibility of our approach to the formal specification and
verification of mathematical software. Our basic concept, the
use of non-standard analysis to represent the asymptotic behavior
of programs, is new; there does not appear to be anything
comparable to it in the literature. Further experimentation with
approaches is necessary before an appropriate verification system
can be designed. We believe such experimentation is best carried
out using rapid prototyping. An experimental VCG can be built
without an accompanying theorem prover and used to examine the 
forms of the VCs generated; simplification rewrite rules over the 
non-standard reals can be devised in order to simplify the print 
form of the VCs; our ML prototype should be completed in several 
different ways and experimented with. 

Our final vision is a system in which a mathematically
sophisticated programmer/mathematician could interactively verify
libraries of floating-point routines or critical sections of
large systems which use the floating-point data type. These
verified programs might then be transferred to other machines
following the Host/Target scenario familiar in embedded systems.
The reason for such configurations is that environments useful
for program development (including formal
specification/verification) are not necessarily optimal for
run-time requirements like advanced floating-point precision and
efficiency. The portable programs produced by our verification
environment can then be used with far greater assurance of their
reliability. Indeed, our asymptotic approach to verification is
consistent with and supports the use of verified programs on a
variety of machines.

Our greatest departure from mainstream efforts in program 
verification is in using non-standard analysis. This is at once 
the most risky and the most innovative aspect. In the past 20 
years the logical basis of non-standard analysis has been worked 
out but mainly by mathematical logicians as opposed to computer 
scientists. Thus tasks of building formal languages with their 
accompanying grammars and parsers which express these concepts 
and automated proof environments which manipulate the constructs 
are open research areas. The only applicable work in automated 
theorem proving which we are aware of is [1]. 

1. Ballantyne and Bledsoe, "Automatic Proofs of Theorems in
Analysis Using Nonstandard Techniques," JACM 24 (July 1977),
pp. 353-374.

While very little precedent for our approach is available, we do
feel that, based on our experience so far, such an approach
appears to be conceptually simpler than conventional techniques
which rely on bounding machine operations on floating-point
numbers to within an "epsilon" of the actual result. The proofs
using such rules are difficult and unenlightening. However,
statements about the asymptotic precision of programs, like
statements about the asymptotic complexity of programs, make
meaningful assertions about programs and at the same time permit
an intuitive theory to be developed. This is the advantage of
using non-standard analysis as the theoretical underpinning of
verification.

Using non-standard analysis as the theoretical basis, we have
discussed building a verification system using two different
approaches with proven feasibility. There are several well-known
verifying systems based on the VC approach. There are, for
example, the Stanford Pascal Verifier [2], the Gypsy Verification
Environment [3], and the not yet completed Euclid Verification
System [4]. None of these systems supports either fixed or
floating-point reals. The success of the VCG approach depends on
constructing a "good" theorem prover. Such a theorem prover
should prove trivial theorems and simplify non-theorems
automatically while supporting a user-machine interaction to
prove more difficult theorems. Finding the right mix between
automatic and proof checker mode is the subject of much current
research. Using non-standard analysis as the underlying theory
causes no additional burden, since it can be adequately
axiomatized in first-order logic.

2. W. Polak, Compiler Specification and Verification, Lecture
Notes in Computer Science 124, Springer-Verlag, 1981.

3. Donald Good, et al., Using the Gypsy Methodology, University of
Texas, Austin, 1981.

4. D. Craigen, Ottawa Euclid and EVES: A Status Report, Proc. 1984
Symp. on Security and Privacy, IEEE Computer Society.


The second approach to verification using non-standard analysis
that we have proposed is the programming logic approach. A
system based on this approach is presented in [5]. This is a
programming logic adapted to a variant of the PL/I language
(without real data types). The PRL (for Program Refinement
Logic) project at Cornell University [6] is a continuing
NSF-sponsored research effort along these lines. While it is not
yet clear if the formalization of non-standard analysis in this
framework is flexible enough, the benefits of success would,
however, be great. First of all, all verified programs
terminate. The system proves total correctness and not just
partial correctness. All of the VCG environments mentioned
previously consider only partial correctness. Second, rapid
prototyping and experimentation with the logic is possible using
the programming language ML. Third, meta-reasoning or more
abstract reasoning would be possible as in the PRL project.
Finally, decision procedures can easily be incorporated to prove
the trivial details. The drawback of the programming logic
approach is that it does not produce programs in the imperative
form that programmers are used to. This leads to a possible
acceptance problem. If previously compiled library routines need
only be linked and used and not modified then there is no
difficulty. But if verified programs need to be modified the
programming logic route would entail training in a new language.

5. Constable, R., et al., An Introduction to the PL/CV2
Programming Logic, Lecture Notes in Computer Science 135,
Springer-Verlag, 1982.

6. Constable and Bates, "The Nearly Ultimate Pearl", Cornell
University Technical Report, January 1984.

The VCG approach also leads to modification difficulties, since
the verification is nullified when changes are made. It is
difficult to verify a program that one hasn't written and also
difficult to reverify a program which one has written but which
has been modified by someone else. On the other hand, VCG-based
environments can be designed using data base capabilities which
minimize the reverification effort. If modification and
readability by those who are not verification experts are
concerns, then the VCG approach should be tailored to known
languages like FORTRAN, HAL/S, or Ada even though they are not as
well structured from a verification point of view as Euclid or
Gypsy.
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Appendix A 


Proofs 


This Appendix contains the proofs of several theorems which 
were used in the course of proving VCs for examples in Section 4. 

THEOREM 1: CR is monotone

Suppose not, i.e. suppose that there exist x and y such that x
<= y but CR(y) < CR(x).

Case 1: CR(y) <= x 

In this case, CR(y) <= x <= y, and CR(CR(y)) = CR(y) by the 
second cropping function axiom. Therefore, by the fourth 
cropping function axiom, CR(x) = CR(y), a contradiction. 

Case 2: x < CR(y) 

In this case, x < CR(y) < CR(x), and CR(CR(x)) = CR(x), so by the
fourth cropping function axiom, CR(x) = CR(CR(y)) = CR(y), a
contradiction.
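Monotonicity and idempotence can be observed on a concrete cropping
function. The sketch below assumes one particular choice, rounding
a double to the nearest single-precision value; the report's CR is
characterized only by its axioms, so this toy CR and the helper
names are ours.

```python
import random
import struct


def CR(x: float) -> float:
    """Toy cropping function: round a double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_monotone_on(samples) -> bool:
    """x <= y should imply CR(x) <= CR(y) for every pair of samples."""
    ordered = sorted(samples)
    cropped = [CR(v) for v in ordered]
    return all(a <= b for a, b in zip(cropped, cropped[1:]))


def is_idempotent_on(samples) -> bool:
    """CR(CR(x)) = CR(x): cropping a machine real leaves it unchanged."""
    return all(CR(CR(v)) == CR(v) for v in samples)
```

Checking adjacent pairs of the sorted sample is enough, since <= on
the cropped values is transitive.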

THEOREM 2: There is no machine real strictly between x
and CR(x)

Suppose not, i.e. that there exist x and y such that y is
strictly between x and CR(x) and CR(y) = y. CR(CR(x)) = CR(x),
so by the fourth cropping function axiom, CR(y) = CR(x). But then
y = CR(y) = CR(x), so y is not strictly between x and CR(x), a
contradiction.
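Theorem 2 can likewise be observed on a toy model. This sketch
assumes that the machine reals are the single-precision values and
that CR rounds to the nearest of them; it handles positive x only,
and the helper names neighbour and gap_is_empty are ours.

```python
import random
import struct


def CR(x: float) -> float:
    """Toy cropping function: round a double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def neighbour(c: float, up: bool) -> float:
    """Adjacent single-precision value above or below a positive machine real c."""
    bits = struct.unpack("<I", struct.pack("<f", c))[0]
    bits += 1 if up else -1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def gap_is_empty(x: float) -> bool:
    """True if no machine real lies strictly between x and CR(x), for x > 0."""
    c = CR(x)
    if c == x:
        return True
    # The machine real next to CR(x) on x's side must lie beyond x.
    n = neighbour(c, up=(x > c))
    return n > x if x > c else n < x
```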

THEOREM 3: If (1 + X/N) == 1 and J is a standard
integer, then (1 + X/N)^J == 1

The proof is by induction on J. Note that the statement we are
trying to prove is external, so induction will only prove it for
standard J, but this is all we want.

For J = 0 the formula is trivially true. Now suppose (1 +
(X/N))^J == 1. (1 + (X/N)) == 1, so (1 + (X/N)) is finite and we
can multiply both sides of the inductive hypothesis to get

    (1 + (X/N))^(J+1) == (1 + (X/N)) == 1

and so the theorem is proved for all standard J.
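The multiplication step in the induction can be made explicit. The
following is a sketch; the abbreviations a and b are ours, and the
only facts assumed are that an infinitesimal times a finite number
is infinitesimal and that a sum of two infinitesimals is
infinitesimal.

```latex
% Abbreviations (ours): a = (1 + X/N)^J, b = 1 + X/N.
% Induction hypothesis: a == 1. Assumption of the theorem: b == 1,
% so b is finite.
\begin{align*}
  (1 + X/N)^{J+1} - 1 &= ab - 1 \\
                      &= (a - 1)\,b + (b - 1).
\end{align*}
% (a - 1) is infinitesimal and b is finite, so (a - 1) b is
% infinitesimal; (b - 1) is infinitesimal; hence ab - 1 is
% infinitesimal, i.e. (1 + X/N)^{J+1} == 1.
```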


THEOREM 4: If Z is a finite real and J is a finite
integer, then Z^J is a finite real.

The proof is by induction on J. Again, induction will only prove
the statement for J standard. Since all finite integers are
standard, this will prove the theorem.

If J = 0, Z^J = 1, a finite real. Now assume Z^J is finite.
Z^(J+1) = Z^J * Z. Z^J and Z are both finite, so their product is
finite. This finishes the induction.

THEOREM 5: If f0 is a continuous function from R
to R, f the non-standard extension of f0, and g(x) == f(x)
for all finite x, then for any finite x, there exists a standard
y such that y == x and f0(y) == g(x)

First of all, the non-standard analysis statement of "f0 is
continuous" is

    all x,y : R [std(x) & x == y -> f(x) == f(y)]

Since x is finite, there is a standard real y infinitely close
to x. By the above statement of the continuity of f0, f(y) ==
f(x). y standard implies that f(y) = f0(y). Therefore f0(y) ==
f(x). x finite implies that f(x) == g(x). The theorem follows
by transitivity of ==.
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